Abstract

The Lovász Local Lemma (LLL) is a powerful tool that gives sufficient conditions for avoiding all of
a given set of "bad" events, with positive probability. A series of results have provided algorithms to 
efficiently construct structures whose existence is non-constructively guaranteed by the LLL, culminating 
in the recent breakthrough of Moser & Tardos for the full asymmetric LLL. We show that the output 
distribution of the Moser-Tardos algorithm well-approximates the conditional LLL-distribution - the
distribution obtained by conditioning on all bad events being avoided. We show how a known bound on 
the probabilities of events in this distribution can be used for further probabilistic analysis and give new 
constructive and non-constructive results. 

We also show that when a LLL application provides a small amount of slack, the number of resamplings 
of the Moser-Tardos algorithm is nearly linear in the number of underlying independent variables (not
events!), and can thus be used to give efficient constructions in cases where the underlying proof applies 
the LLL to super-polynomially many events. Even in cases where finding a bad event that holds is 
computationally hard, we show that applying the algorithm to avoid a polynomial-sized "core" subset 
of bad events leads to a desired outcome with high probability. This is shown via a simple union bound 
over the probabilities of non-core events in the conditional LLL-distribution, and automatically leads to 
simple and efficient Monte-Carlo (and in most cases RNC) algorithms. 

We demonstrate this idea on several applications. We give the first constant-factor approximation 
algorithm for the Santa Claus problem by making an LLL-based proof of Feige constructive. We provide
Monte Carlo algorithms for acyclic edge coloring, non-repetitive graph colorings, and Ramsey-type
graphs. In all these applications, the algorithm falls directly out of the non-constructive LLL-based proof. 
Our algorithms are very simple, often provide better bounds than previous algorithms, and are in several 
cases the first efficient algorithms known. 

As a second type of application we show that the properties of the conditional LLL-distribution can be 
used in cases beyond the critical dependency threshold of the LLL: avoiding all bad events is impossible 
in these cases. As the first (even non-constructive) result of this kind, we show that by sampling a 
selected smaller core from the LLL-distribution, we can avoid a fraction of bad events that is higher than 
the expectation. MAX-k-SAT is an illustrative example of this.
1 Introduction



The well-known Lovász Local Lemma (LLL) [24] is a powerful probabilistic approach to prove the existence of
certain combinatorial structures. Its diverse range of applications includes breakthroughs in packet-routing
[33], a variety of theorems in graph-coloring including list coloring, frugal coloring, total coloring, and 
coloring graphs with lower-bounded girth [38], as well as a host of other applications where probability 
appears at first sight to have no role [10]. Furthermore, almost all known applications of the LLL have no 
alternative proofs known. While the original LLL was non-constructive - it was unclear how the existence 
proofs could be turned into polynomial-time algorithms - a series of works [17, 1, 23, 37, 38, 46, 41, 39, 40] 
beginning with Beck [17] and culminating with the breakthrough of Moser & Tardos (MT) [40] have led to 
efficient algorithmic versions for most such proofs. However, there are several LLL applications to which 
these approaches inherently cannot apply; our work makes progress toward bridging this gap, by uncovering 
and exploiting new properties of [40]. We also obtain what are, to our knowledge, the first algorithmic 
applications of the LLL where a few of the bad events have to happen, and where we aim to keep the number 
of these small. 

We will use standard notation: e denotes the base of the natural logarithm, and ln and log denote the
logarithm to the base e and 2, respectively. 

Essentially all known applications of the LLL use the following framework. Let $\mathcal{P}$ be a collection of n
mutually independent random variables $\{P_1, \ldots, P_n\}$, and let $\mathcal{A} = \{A_1, A_2, \ldots, A_m\}$ be a collection of m
("bad") events, each determined by some subset of $\mathcal{P}$. The LLL (Theorem 1.1) shows sufficient conditions
under which, with positive probability, none of the events $A_i$ holds: i.e., that there is a choice of values for
the variables in $\mathcal{P}$ (corresponding to a discrete structure such as a suitable coloring of a given graph) that avoids
all the $A_i$. Under these same sufficient conditions, MT shows the following very simple algorithm to make
such a choice: (i) initially choose the $P_i$ independently from their given distributions; (ii) while the current
assignment to $\mathcal{P}$ does not avoid all the $A_i$, repeat: arbitrarily choose a currently-true $A_i$, and resample, from
their product distribution, the variables in $\mathcal{P}$ on which $A_i$ depends. The amazing aspect of MT is that the
expected number of resamplings is small [40]: at most poly(n, m) in all known cases of interest. However,
there are two problems with implementing MT that come up in some applications of the LLL:

(a) the number of events m can be superpolynomial in the number of variables n; this can result in a
superpolynomial running time in the "natural" parameter n¹; and, even more seriously,

(b) given an assignment to $\mathcal{P}$, it can be computationally hard (e.g., NP-hard or yet-unknown to be in
polynomial time) to either certify that no $A_i$ holds, or to output an index i such that $A_i$ holds.
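
The two-step resampling loop described above, and the place where problems (a) and (b) enter, can be sketched as follows. This is a minimal sketch, assuming k-SAT clauses as the bad events; the function name `moser_tardos`, the clause encoding and the `max_resamplings` budget are illustrative assumptions and are not taken from [40].

```python
import random

def moser_tardos(num_vars, clauses, rng, max_resamplings=100_000):
    """Resample violated clauses until none is left (or the budget runs out).

    A clause is a list of non-zero ints: literal v means "variable v is True",
    literal -v means "variable v is False" (variables are numbered from 1).
    """
    # (i) sample every variable independently and uniformly
    assignment = {v: rng.random() < 0.5 for v in range(1, num_vars + 1)}

    def violated(clause):
        # the bad event of a clause: every one of its literals is false
        return all(assignment[abs(lit)] != (lit > 0) for lit in clause)

    resamplings = 0
    while resamplings < max_resamplings:
        # find a currently-true bad event by scanning all m events
        bad = next((c for c in clauses if violated(c)), None)
        if bad is None:
            return assignment, resamplings
        # (ii) resample only the variables this bad event depends on
        for lit in bad:
            assignment[abs(lit)] = rng.random() < 0.5
        resamplings += 1
    raise RuntimeError("resampling budget exhausted")
```

The linear scan that looks for a violated clause is exactly the step that becomes too slow under (a) and may not be available at all under (b).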

Since detection and resampling of a currently-bad event is the seemingly unavoidable basic step in the MT 
algorithm, these applications seemed far out of reach. We deal with a variety of applications wherein (a) 
and/or (b) hold, and develop Monte Carlo (and in many cases, RNC) algorithms whose running time is 
polynomial in n: some of these applications involve a small loss in the quality of the solution. (We loosely let 
"RNC algorithms" denote randomized parallel algorithms that use poly(n) processors and run in polylog(n) 
time, to output a correct solution with high probability.) First we show that the MT algorithm needs only
$O(n^2 \log n)$ many resampling steps in all applications that are known (and in most cases $O(n \cdot \mathrm{polylog}(n))$),
even when m is superpolynomial in n. This makes those applications constructive that allow an efficient
implicit representation of the bad events (in very rough analogy with the usage of the ellipsoid algorithm for
convex programs with exponentially many constraints but with good separation oracles). Still, most of our
applications have problem (b). For these cases, we introduce a new proof-concept based on the (conditional)
LLL-distribution - the distribution D on $\mathcal{P}$ that one obtains when conditioning on no $A_i$ happening. Some
very useful properties are known for D [10]: informally, if B depends "not too heavily" on the events in $\mathcal{A}$,
then the probability placed on B by D is "not much more than" the unconditional probability Pr[B]: at
most $f_{\mathcal{A}}(B) \cdot \Pr[B]$ (see (3)). Such bounds in combination with further probabilistic analysis can be used
to give interesting (nonconstructive) results. Our next main contribution is that the MT algorithm has an
output distribution (say D') that "approximates" the LLL-distribution D: in that for every B, the same
upper bound $f_{\mathcal{A}}(B) \cdot \Pr[B]$ as above holds in D' as well. This can be used to make probabilistic proofs that
use the LLL-condition constructive.

¹ n is the parameter of interest since the output we seek is one value for each of $P_1, P_2, \ldots, P_n$.

Problem (b), in all cases known to us, comes from problem (a): it is easy to test if any given Ai holds 
currently (e.g., if a given subset of vertices in a graph is a clique), with the superpolynomiality of m being 
the apparent bottleneck. To circumvent this, we develop our third main contribution: the very general 
Theorem 3.4 that is simple and directly applicable in all LLL instances that allow a small slack in the LLL's 
sufficient conditions. This theorem proves that a small poly(n)-sized core-subset of the events in A can be 
selected and avoided efficiently using the MT algorithm. Using the LLL-distribution and a simple union 
bound over the non-core events, we get efficient (Monte Carlo and/or RNC) algorithms for these problems. 

We develop two types of applications, as sketched next. 

1.1 Applications that avoid all bad events

A summary of four applications follows; all of these have problem (a), and all but the acyclic-coloring 
application have problem (b). Most such results have RNC versions as well. 

The Santa Claus Problem: The Santa Claus problem is the restricted assignment version of the max-min
allocation problem of indivisible goods. The Santa Claus has n items that need to be distributed among m
children. Each child has a utility for each item, which is either 0 or some given $p_j$ for item j. The objective
is to assign each item to some child, so that the minimum total utility received by any child is maximized.
This problem has received much attention recently [15, 14, 26, 13, 16, 20]. The problem is NP-hard and the
best-known approximation algorithm, due to Bansal and Sviridenko [15], achieves an approximation factor of
$O(\frac{\log\log m}{\log\log\log m})$ by rounding a certain configuration LP. Later, Feige in [26] and subsequently Asadpour, Feige
and Saberi in [13] showed that the integrality gap of the configuration LP is a constant. Surprisingly, both
results were obtained using two different non-constructive approaches and left the question of a constant-factor
approximation algorithm open. This made the Santa Claus problem one of the rare instances [27]
in which the proof of an integrality gap did not result in an approximation algorithm with the same ratio.
In this paper we resolve this by making the non-constructive LLL-based proof of Feige [26] constructive
(Section 4) and giving the first constant-factor approximation algorithm for the Santa Claus problem.

Non-repetitive Coloring of Graphs: Given a graph H = (V, E), a k-coloring (not necessarily proper) of the
edges of H is called non-repetitive if the sequence of colors along any simple path is not the same in the first
and the second half. The smallest k such that H has a non-repetitive k-coloring is called the Thue number
$\pi(H)$ of H [47]. Alon, Grytczuk, Hałuszczak and Riordan showed via the LLL that $\pi(H) \le O(\Delta(H)^2)$
[5], where $\Delta(H)$ is the maximum degree of any vertex in H. This was followed by much additional work
[22, 45, 29, 32, 19, 4]. However, no efficient construction is known to date, except for special classes of
graphs such as complete graphs, cycles and trees. We present a randomized algorithm for non-repetitive
coloring of H using at most $O(\Delta(H)^{2+\epsilon})$ colors, for every constant $\epsilon > 0$ (Section 5).

General Ramsey-Type Graphs: The Ramsey number $R(U_s, V_t)$ refers to the smallest n such that any graph
on n vertices either contains a $U_s$ within any subgraph of s vertices, or there exist t vertices that do not
contain $V_t$. Obtaining lower bounds for various special cases of $R(U_s, V_t)$ and constructing Ramsey-type
graphs have been studied in much detail [2, 9, 31, 7]. A predominant case for such problems is when s is held
fixed. We consider the general setting of $R(U_s, V_t)$ with fixed s, and provide efficient randomized algorithms
for constructing Ramsey-type graphs (Section 6).

Acyclic Edge-Coloring: A proper edge-coloring of a graph is acyclic iff each cycle in it receives more than 2
colors. The acyclic chromatic number a(G) introduced in [2n] is the minimum number of colors in a proper
acyclic edge coloring of G [8, 37, 11, 28, 42]. Alon, McDiarmid and Reed [8] showed that $a(G) \le 64\Delta$, where
$\Delta$ is the maximum degree. The constant was later improved to 16 by Molloy and Reed [37], who also mention
an algorithmic version using $20\Delta$ colors. However it was conjectured that $a(G) = \Delta + 2$; Alon, Sudakov and
Zaks showed that the conjecture is indeed true for graphs having girth $\Omega(\Delta \log \Delta)$ [11]. Their algorithm can be
made constructive using Beck's technique [17] to obtain an acyclic edge coloring using $\Delta + 2$ colors, albeit for
graphs with girth significantly larger than $\Theta(\Delta \log \Delta)$ [11]. We bridge this gap by providing constructions
to achieve the same girth bound as in [11], yet obtaining an acyclic edge coloring with only $\Delta + 2$ colors. For
graphs with no girth bound, $16\Delta$ colors suffice to efficiently construct an acyclic edge coloring in contrast to
the $20\Delta$ algorithmic bound of [37] (Section 7).

The recent result of Matthew Andrews on approximating the edge-disjoint paths problem on undirected 
graphs is another example, where problems (a) and (b) occur and our LLL-techniques are applied to avoid
super-polynomially many bad events [12].

1.2 Applications that avoid many bad events 

Many settings require "almost all" bad events to be avoided, and not necessarily all; e.g., consider MAX-SAT 
as opposed to SAT. However, in the LLL context, essentially the only known general applications were "all 
or nothing": either the LLL's sufficient conditions hold, and we are able to avoid all bad events, or the LLL's 
sufficient conditions are violated, and the only known bound on the number of bad events is the trivial one 
given by the linearity of expectation (which does not exploit any "almost-independence" of the bad events, 
as does the LLL). This situation is even more pronounced in the algorithmic setting. We take what are, to 
our knowledge, the first steps in this direction, interpolating between these two extremes. 

While our discussion here holds for all applications of the symmetric LLL, let us take MAX-k-SAT as an
illustrative example. (The LLL is stated in Section 1.3, but let us recall its well-known "symmetric" special
case: in the setting of MT with $\mathcal{P}$ and $\mathcal{A}$ as defined near the beginning of Section 1, if $\Pr[A_i] \le p$ and $A_i$
depends on at most d other $A_j$ for all i, then $e \cdot p \cdot (d+1) \le 1$ suffices to avoid all the $A_i$.) Recall that in
MAX-k-SAT, we have a CNF formula on n variables, with m clauses each containing exactly k literals; as
opposed to SAT, where we have to satisfy all clauses, we aim to maximize the number of satisfied clauses
here. The best general upper-bounds on the number of "violated events" (unsatisfied clauses) follow from
the probabilistic method, where each variable is set to True or False uniformly at random and independently.
On the one hand, the linearity of expectation yields that the expected number of unsatisfied clauses is $m \cdot 2^{-k}$
(with a derandomization using the method of conditional probabilities). On the other hand, if each clause
shares a variable with at most $2^k/e - 1$ other clauses, a simple application of the symmetric LLL shows
that all clauses can be satisfied (and made constructive using MT). No interpolation between these was
known before; among other results, we show that if each clause shares a variable with at most $\sim \alpha 2^k/e$ other
clauses for $1 < \alpha < e$, then we can efficiently construct an assignment to the variables that violates at most
$(e \ln(\alpha)/\alpha + o(1)) \cdot m \cdot 2^{-k}$ clauses for large k. (This is better than the linearity of expectation iff $\alpha < e$: it
is easy to construct examples with $\alpha = e$ where one cannot do better than the linearity of expectation. See
[()] for the fixed-parameter tractability of MAX-k-SAT above $(1 - 2^{-k})m$ satisfied clauses.)
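
To see how the interpolation behaves numerically, the leading factor e ln(α)/α in front of m · 2^(-k) can be tabulated. This is only an illustrative sketch: the function name `violated_fraction_factor` is an assumption, the o(1) term is ignored, and α = e is allowed as an input purely to show the boundary case.

```python
import math

def violated_fraction_factor(alpha):
    """Leading factor e*ln(alpha)/alpha multiplying m * 2**(-k) (o(1) term dropped)."""
    if not 1 < alpha <= math.e:
        raise ValueError("alpha must lie in (1, e]")
    return math.e * math.log(alpha) / alpha

# A factor below 1 means an improvement over the linearity-of-expectation bound.
for alpha in (1.5, 2.0, 2.5, math.e):
    print(f"alpha={alpha:.3f}  factor={violated_fraction_factor(alpha):.3f}")
```

The factor increases with α on (1, e] and reaches 1 at α = e, which matches the statement that the bound beats the linearity of expectation exactly when α < e.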

The above and related results for applications of the symmetric LLL, follow from the connection to the 
"further probabilistic analysis using the remaining randomness of LLL-distributions" that we alluded to 
above; see Section 8. We believe this connection to be the main conceptual message of this paper, and 
expect further applications in the future. 

1.3 Preliminaries & Algorithmic Framework 

We follow the general algorithmic framework of the Local Lemma due to MT. As in our description at the
beginning of Section 1, let $\mathcal{P}$ be a finite collection of mutually independent random variables $\{P_1, P_2, \ldots, P_n\}$
and let $\mathcal{A} = \{A_1, A_2, \ldots, A_m\}$ be a collection of events, each determined by some subset of $\mathcal{P}$. For any event
B that is determined by a subset of $\mathcal{P}$ we denote the smallest such subset by vbl(B). For any event B that
is determined by the variables in $\mathcal{P}$, we furthermore write $\Gamma(B) = \Gamma_{\mathcal{A}}(B)$ for the set of all events $A \ne B$ in
$\mathcal{A}$ with $\mathrm{vbl}(A) \cap \mathrm{vbl}(B) \ne \emptyset$. This neighborhood relation induces the following standard dependency graph
or variable-sharing graph on $\mathcal{A}$: for the vertex set $\mathcal{A}$ let $G = G_{\mathcal{A}}$ be the undirected graph with an edge
between events $A, B \in \mathcal{A}$ iff $A \in \Gamma(B)$. We often refer to events in $\mathcal{A}$ as bad events and want to find a point
in the probability space, or equivalently an assignment to the variables $\mathcal{P}$, wherein none of the bad events
happen. We call such an assignment a good assignment.
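
The neighborhood relation just defined can be computed directly from the variable sets. The following is a minimal sketch under the assumption that events are given by name together with their vbl sets; the function name `variable_sharing_graph` and the dictionary encoding are illustrative choices.

```python
from itertools import combinations

def variable_sharing_graph(vbl):
    """Map each event to the set of other events that share a variable with it.

    `vbl` maps an event name to the set of variables that determine the event.
    """
    gamma = {event: set() for event in vbl}
    for a, b in combinations(vbl, 2):
        if vbl[a] & vbl[b]:          # the two variable sets intersect
            gamma[a].add(b)
            gamma[b].add(a)
    return gamma

# Example: A1 and A2 share variable 2, A3 is isolated.
example = variable_sharing_graph({"A1": {1, 2}, "A2": {2, 3}, "A3": {4}})
```

This pairwise construction takes time quadratic in the number of events, so it is only meant for instances where the events can be listed explicitly.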

With these definitions the general ("asymmetric") version of the LLL simply states: 

Theorem 1.1 (Asymmetric Lovász Local Lemma). With $\mathcal{A}$, $\mathcal{P}$ and $\Gamma$ defined as above, if there exists an
assignment of reals $x : \mathcal{A} \to (0, 1)$ such that

$$\forall A \in \mathcal{A}: \quad \Pr[A] \le x(A) \prod_{B \in \Gamma(A)} (1 - x(B)), \tag{1}$$

then the probability of avoiding all bad events is at least $\prod_{A \in \mathcal{A}} (1 - x(A)) > 0$, and thus there exists a good
assignment to the variables in $\mathcal{P}$.
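
As a worked check, the symmetric special case recalled in Section 1.2 follows from the asymmetric condition by the standard choice x(A) = 1/(d+1) for every event. The derivation below is a sketch under the assumptions that every event has probability at most p, at most d ≥ 1 neighbors, and that e · p · (d+1) ≤ 1.

```latex
% Assumptions: Pr[A] <= p, |\Gamma(A)| <= d with d >= 1, and e p (d+1) <= 1.
% Choice (standard, not specific to this paper): x(A) = 1/(d+1) for all A.
x(A)\prod_{B\in\Gamma(A)}\bigl(1-x(B)\bigr)
  \;\ge\; \frac{1}{d+1}\Bigl(1-\frac{1}{d+1}\Bigr)^{d}
  \;=\; \frac{1}{d+1}\cdot\frac{1}{(1+1/d)^{d}}
  \;\ge\; \frac{1}{e\,(d+1)}
  \;\ge\; p \;\ge\; \Pr[A].
```

The third step uses $(1 + 1/d)^d \le e$, and the fourth step is the symmetric condition itself.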



We study several LLL instances where the number of events to be avoided, m, is super-polynomial in n; our
goal is to develop algorithms whose running time is polynomial in n which is also the size of the output - 
namely a good assignment of values to the n variables. We introduce a key parameter: 

$$\delta := \min_{A \in \mathcal{A}} x(A) \prod_{B \in \Gamma(A)} (1 - x(B)). \tag{2}$$

Note that without loss of generality $\delta \le 1/4$ because otherwise all $A \in \mathcal{A}$ are independent, i.e., defined on
disjoint sets of variables. Indeed if $\delta > 1/4$ and there is an edge in G between $A \in \mathcal{A}$ and $B \in \mathcal{A}$ then we
have $1/4 < x(A)(1 - x(B))$ and $1/4 < x(B)(1 - x(A))$, i.e., $\frac{1}{4} \cdot \frac{1}{4} < x(A)(1 - x(A)) \cdot x(B)(1 - x(B))$, which is a
contradiction because $x(1 - x) \le 1/4$ for all x (the maximum is attained at $x = 1/2$).

We allow our algorithms to have a running-time that is polynomial in $\log(1/\delta)$; in all applications known
to us, $\delta \ge \exp(-O(n \log n))$, and hence, $\log(1/\delta) = O(n \log n)$. In fact because $\delta$ is an upper bound for
$\min_{A \in \mathcal{A}} \Pr[A]$, in any typical encoding of the domains and the probabilities of the variables, $\log(1/\delta)$ will be
at most linear in the size of the input or the output.
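
Condition (1) and the parameter from (2) are easy to evaluate once the x-values and neighborhoods are given explicitly. The following is a minimal sketch; the function names `lll_condition_holds` and `lll_delta` and the dictionary encoding of events are assumptions made for illustration.

```python
import math

def lll_condition_holds(prob, x, gamma):
    """Check Pr[A] <= x(A) * prod over B in Gamma(A) of (1 - x(B)) for every event A."""
    return all(
        prob[a] <= x[a] * math.prod(1 - x[b] for b in gamma[a])
        for a in prob
    )

def lll_delta(x, gamma):
    """The parameter from (2): the minimum over A of x(A) * prod (1 - x(B))."""
    return min(x[a] * math.prod(1 - x[b] for b in gamma[a]) for a in x)

# Two events sharing a variable, both with x-value 1/2.
gamma = {"A": {"B"}, "B": {"A"}}
x = {"A": 0.5, "B": 0.5}
print(lll_delta(x, gamma), lll_condition_holds({"A": 0.2, "B": 0.2}, x, gamma))
```

In this two-event example the parameter equals 1/4, the largest value it can take when the dependency graph has at least one edge.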

The following subsection 1.4 reviews the MT algorithm and its analysis, which will be helpful to understand 
some of our proofs and technical contributions; the reader familiar with the MT algorithm may skip it. 



1.4 Review of the MT Algorithm and its Analysis 

Recall the resampling-based MT algorithm; let us now review some of the technical elements in the analysis 
of this algorithm, that will help in understanding our technical contributions better. 

A witness tree $\tau = (T, \sigma_T)$ is a finite rooted tree T together with a labeling $\sigma_T : V(T) \to \mathcal{A}$ of its vertices
to events, such that the children of a vertex $u \in V(T)$ receive labels from $\Gamma(\sigma_T(u)) \cup \{\sigma_T(u)\}$. In a proper
witness tree distinct children of the same vertex always receive distinct labels. The "log" C of an execution
of MT lists the events as they have been selected for resampling in each step. Given C, we can associate
a witness tree $\tau_C(t)$ with each resampling step t that can serve as a justification for the necessity of that
correction step. $\tau_C(t)$ will be rooted at C(t). A witness tree $\tau$ is said to occur in C if there exists $t \in \mathbb{N}$ such
that $\tau_C(t) = \tau$. It has been shown in [40] that if $\tau$ appears in C, then it is proper and it appears in C with
probability at most $\prod_{v \in V(\tau)} \Pr[\sigma_T(v)]$.

To bound the running time of the MT algorithm, one needs to bound the number of times an event $A \in \mathcal{A}$ is
resampled. If $N_A$ denotes the random variable for the number of resampling steps of A and C is the execution
log, then $N_A$ is the number of occurrences of A in this log and also the number of distinct proper witness
trees occurring in C that have their root labeled A. As a result one can bound the expected value of $N_A$
simply by summing the probabilities of appearances of distinct witness trees rooted at A. These probabilities
can be related to a Galton-Watson branching process to obtain the desired bound on the running time.

A Galton-Watson branching process can be used to generate a proper witness tree as follows. In the first
round the root of the witness tree is produced, say it corresponds to event A. Then in each subsequent
round, for each vertex v produced in the previous round independently and, again independently, for each
event $B \in \Gamma(\sigma_T(v)) \cup \{\sigma_T(v)\}$, B is selected as a child of v with probability x(B) and is skipped with
probability $1 - x(B)$. We will use the concepts of proper witness trees and the Galton-Watson process in
several of our proofs.
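
The branching process just described can be simulated directly. The sketch below assumes events are identified by strings, neighborhoods are given as sets, and x-values as a dictionary; the function name `sample_witness_tree`, the nested-tuple tree encoding and the `max_nodes` guard are illustrative assumptions.

```python
import random

def sample_witness_tree(root, gamma, x, rng, max_nodes=10_000):
    """Grow a tree by the branching process; a node is a (label, children) pair.

    Every vertex labeled A independently receives, for each B that is a
    neighbor of A or equal to A, one child labeled B with probability x[B].
    """
    tree = (root, [])
    frontier = [tree]
    nodes = 1
    while frontier:
        label, children = frontier.pop()
        for b in sorted(gamma[label] | {label}):
            if rng.random() < x[b]:
                child = (b, [])
                children.append(child)
                frontier.append(child)
                nodes += 1
                if nodes > max_nodes:   # guard: the process need not die out quickly
                    raise RuntimeError("tree grew past max_nodes")
    return tree
```

Because each candidate label is considered exactly once per vertex, siblings always carry distinct labels, so every generated tree is proper.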



2 LLL-Distribution 



When trying to turn the non-constructive Lovász Local Lemma into an algorithm that finds a good assignment,
the following straightforward approach comes to mind: draw a random sample for the variables in $\mathcal{P}$
until one is found that avoids all bad events. If the LLL-conditions are met this rejection-sampling algorithm
certainly always terminates, but because the probability of obtaining a good assignment is typically
exponentially small it takes an expected exponential number of resamplings and is therefore inefficient.
While the celebrated algorithm of Moser (and Tardos) is much more efficient, the above rejection-sampling
method has a major advantage: it does not just produce an arbitrary assignment but provides a randomly
chosen assignment from the distribution that is obtained when one conditions on no bad event happening.
In the following, we call this distribution the LLL-distribution or conditional LLL-distribution.
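
The rejection-sampling approach described above can be written down in a few lines. This is a minimal sketch under the assumption that bad events are given as predicates on a full assignment; the function name `rejection_sample`, the callable `sample_assignment` and the `max_tries` budget are illustrative.

```python
import random

def rejection_sample(sample_assignment, bad_events, rng, max_tries=1_000_000):
    """Draw fresh assignments until one avoids every bad event.

    The returned assignment is a sample from the distribution conditioned on
    no bad event happening; the number of tries can be exponentially large.
    """
    for tries in range(1, max_tries + 1):
        assignment = sample_assignment(rng)
        if not any(event(assignment) for event in bad_events):
            return assignment, tries
    raise RuntimeError("no good assignment found within max_tries")

# Toy instance: four fair bits, bad events "bits 0 and 1 both set"
# and "bits 2 and 3 both set".
def four_bits(rng):
    return [rng.random() < 0.5 for _ in range(4)]

bad = [lambda a: a[0] and a[1], lambda a: a[2] and a[3]]
good, tries = rejection_sample(four_bits, bad, random.Random(1))
```

Unlike the resampling algorithm, every attempt here redraws all variables, which is what makes the expected number of tries grow exponentially in typical applications.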

The LLL-conditions and further probabilistic analysis can be a powerful tool to obtain new results (constructive
or otherwise) like the constructive one in Section 8. The following is a well-known bound on the
probability $\Pr_D[B]$ that the LLL-distribution D places on any event B that is determined by variables in $\mathcal{P}$
(its proof is an easy extension of the standard non-constructive LLL-proof [10]):

Theorem 2.1. If the LLL-conditions from Theorem 1.1 are met, then the LLL-distribution D is well-defined.
For any event B that is determined by $\mathcal{P}$, the probability $\Pr_D[B]$ of B under D satisfies:

$$\Pr_D[B] := \Pr\Big[B \,\Big|\, \bigwedge_{A \in \mathcal{A}} \overline{A}\Big] \le \Pr[B] \cdot \prod_{C \in \Gamma(B)} (1 - x_C)^{-1}; \tag{3}$$

here, Pr[B] is the probability of B holding under a random choice of $P_1, P_2, \ldots, P_n$.



The fact that the probability of an event B does not increase much in the conditional LLL-distribution when
B does not depend on "too many" $C \in \mathcal{A}$ is used critically in the rest of the paper.

More importantly, the following theorem states that the output distribution D' of the MT-algorithm approximates
the LLL-distribution D and has the very nice property that it essentially also satisfies (3):

Theorem 2.2. Suppose there is an assignment of reals $x : \mathcal{A} \to (0, 1)$ such that (1) holds. Let B be any
event that is determined by $\mathcal{P}$. Then, the probability that B was true at least once during the execution of
the MT algorithm on the events in $\mathcal{A}$ is at most $\Pr[B] \cdot \big(\prod_{C \in \Gamma(B)} (1 - x_C)\big)^{-1}$. In particular the probability
of B being true in the output distribution of MT obeys this upper bound.



Proof. The bound on the probability of B ever happening is a simple extension of the MT proof [40]. Note
that we want to prove the theorem irrespective of whether B is in $\mathcal{A}$ or not. In either case we are interested
in the probability that the event was true at least once during the execution, i.e., if B is in $\mathcal{A}$, whether it
could have been resampled at least once. The witness trees that certify the first time B becomes true are
the ones that have B as a root and all non-root nodes from $\mathcal{A} \setminus \{B\}$. Similarly as in [40], we calculate the
expected number of these witness trees via a union bound. Let $\tau$ be a fixed proper witness tree with its root
vertex labeled B. Following the proof of Lemma 3.1 of [40] and using the fact that B cannot be a child of itself,
it can be shown that the probability $p_\tau$ with which the Galton-Watson process that starts with B yields
exactly the tree $\tau$ is $p_\tau = \prod_{A \in \Gamma(B)} (1 - x(A)) \cdot \prod_{v \in \overline{V}(\tau)} x'(\sigma_v)$. Here $\overline{V}(\tau)$ are the non-root vertices of $\tau$ and
$x'(\sigma_v) = x(\sigma_v) \prod_{C \in \Gamma(\sigma_v)} (1 - x(C))$. Plugging this into the arguments following the proof of Lemma 3.1 of
[40], it is easy to see that the union bound over all these trees, and therefore also the desired probability, is at
most $\Pr[B] \cdot \big(\prod_{C \in \Gamma(B)} (1 - x_C)\big)^{-1}$, where the term "Pr[B]" accounts for the fact that the root-event B has
to be true as well. □

Using this theorem we can view the MT algorithm as an efficient way to obtain a sample that comes 
approximately from the conditional LLL-distribution. This efficient sampling procedure makes it possible to 
make proofs using the conditional LLL-distribution constructive and directly convert them into algorithms. 
All constructive results of this paper are based on Theorem 2.2 and demonstrate this idea. 
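
The upper bound shared by (3) and Theorem 2.2 is a one-line computation once the x-values of the neighboring bad events are known. The helper below is a hedged sketch; the function name `mt_probability_bound` and its argument layout are assumptions made for illustration.

```python
import math

def mt_probability_bound(pr_b, neighbor_x):
    """Upper bound Pr[B] * prod over C in Gamma(B) of (1 - x_C)^(-1).

    `pr_b` is the unconditional probability of B and `neighbor_x` lists x_C
    for the bad events C sharing a variable with B. The bound is only
    informative when it is below 1.
    """
    return pr_b / math.prod(1 - xc for xc in neighbor_x)

# An event with probability 0.01 that touches two bad events with x-value 0.1.
print(mt_probability_bound(0.01, [0.1, 0.1]))
```

An event with no neighbors among the bad events keeps its unconditional probability, since the empty product equals 1.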

3 LLL Applications with Super-Polynomially Many Bad Events 

In several applications of the LLL, the number of bad events is super-polynomially larger than the number 
of underlying variables. In these cases we aim for an algorithm that still runs in time polynomial in the 
number of variables, and it is not efficient to have an explicit representation of all bad events. Surprisingly, 
Theorem 3.1 shows that the number of resamplings done by the MT algorithm remains quadratic and in 
most cases even near-linear in the number of variables n. 

Theorem 3.1. Suppose there is an $\epsilon \in [0, 1)$ and an assignment of reals $x : \mathcal{A} \to (0, 1)$ such that:

$$\forall A \in \mathcal{A}: \quad \Pr[A] \le (1 - \epsilon)\, x(A) \prod_{B \in \Gamma(A)} (1 - x(B)).$$

With $\delta$ denoting $\min_{A \in \mathcal{A}} x(A) \prod_{B \in \Gamma(A)} (1 - x(B))$, we have

$$T := \sum_{A \in \mathcal{A}} x_A \le n \log(1/\delta). \tag{4}$$

Furthermore:

1. if $\epsilon = 0$, then the expected number of resamplings done by the MT algorithm is at most
$v_1 = T \max_{A \in \mathcal{A}} \frac{1}{1 - x(A)}$, and for any parameter $\lambda \ge 1$, the MT algorithm terminates within $\lambda v_1$ resamplings
with probability at least $1 - 1/\lambda$.

2. if $\epsilon > 0$, then the expected number of resamplings done by the MT algorithm is at most $v_2 = O(\frac{n}{\epsilon} \log \frac{T}{\epsilon})$,
and for any parameter $\lambda \ge 1$, the MT algorithm terminates within $\lambda v_2$ resamplings with probability
$1 - \exp(-\lambda)$.

Proof. The main idea of relating the quantity T to n and $\delta$ is to use: (i) the fact that the variable-sharing
graph G is very dense, and (ii) the nature of the LLL-conditions which force highly connected events to have
small probabilities and x-values. To see that G is dense, consider for any variable $P \in \mathcal{P}$ the set of events

$$\mathcal{A}_P = \{A \in \mathcal{A} \mid P \in \mathrm{vbl}(A)\},$$

and note that these events form a clique in G. Indeed, the m vertices of G can be partitioned into n such
cliques with potentially further edges between them, and therefore G has at least $n \cdot \binom{m/n}{2} = m^2/(2n) - m/2$
edges, which is high density for $m \gg n$.

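Let us first prove the bound on T. To do so, we fix any $P \in \mathcal{P}$ and show that $\sum_{B \in \mathcal{A}_P} x_B \le \log(1/\delta)$,
which will clearly suffice. Recall from the discussion following (2) that we can assume w.l.o.g. that $\delta \le 1/4$.
If $|\mathcal{A}_P| = 1$, then of course $\sum_{B \in \mathcal{A}_P} x_B \le 1 \le \log(1/\delta)$. If $|\mathcal{A}_P| > 1$, let $A \in \mathcal{A}_P$ have the smallest $x_A$ value.
Note that by definition

$$\delta \le x_A \prod_{B \in \mathcal{A}_P \setminus A} (1 - x_B) = \frac{x_A}{1 - x_A} \prod_{B \in \mathcal{A}_P} (1 - x_B).$$

If $x_A \le 1/2$, then $\delta \le \prod_{B \in \mathcal{A}_P} (1 - x_B) \le e^{-\sum_{B \in \mathcal{A}_P} x_B}$, and we get $\sum_{B \in \mathcal{A}_P} x_B \le \ln(1/\delta) < \log(1/\delta)$ as
required. Otherwise, if $x_A > 1/2$, let $B_1 \in \mathcal{A}_P \setminus A$. Then,

$$\delta \le x_A \prod_{B \in \mathcal{A}_P \setminus A} (1 - x_B) = x_A (1 - x_{B_1}) \prod_{B \in \mathcal{A}_P \setminus (A \cup B_1)} (1 - x_B) \le x_A (1 - x_{B_1})\, e^{-\sum_{B \in \mathcal{A}_P \setminus (A \cup B_1)} x_B}. \tag{5}$$

Let us now show that for $1/2 \le x_A \le x_{B_1} \le 1$,

$$x_A (1 - x_{B_1}) \le e^{-(x_A + x_{B_1})}. \tag{6}$$

Fix $x_A$. We thus need to show $e^{x_{B_1}} (1 - x_{B_1}) \le \frac{e^{-x_A}}{x_A}$. The derivative of $e^{x_{B_1}} (1 - x_{B_1})$ is negative for
$x_{B_1} > 0$, showing that it is a decreasing function in the range $x_{B_1} \in [x_A, 1]$. Therefore the maximum value
of $e^{x_{B_1}} (1 - x_{B_1})$ is obtained at $x_{B_1} = x_A$ and for (6) to hold, it is enough to show that $x_A (1 - x_A) \le e^{-2 x_A}$
holds. The second derivative of $e^{-2 x_A} - x_A (1 - x_A)$ is positive. Differentiating $e^{-2 x_A} - x_A (1 - x_A)$ and
equating the derivative to 0 returns the minimum in [1/2, 1] at $x_A = 0.7315$. The minimum value is
0.0351 > 0. Thus we have (6) and so we get

$$x_A (1 - x_{B_1})\, e^{-\sum_{B \in \mathcal{A}_P \setminus (A \cup B_1)} x_B} \le e^{-\sum_{B \in \mathcal{A}_P} x_B};$$

using this with (5), we obtain $\sum_{B \in \mathcal{A}_P} x_B \le \ln(1/\delta) < \log(1/\delta)$ as desired.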

Given the bound on T, part (1) follows directly from the main theorem of [40] and by a simple application 
of Markov's inequality. 

Part (2) now also follows from [40]. In Section 5 of [40] it is shown that saving a $(1 - \epsilon)$ factor in the probability
of every resampling step implies that with high probability, no witness tree of size $\Omega\big(\frac{1}{\epsilon} \log \sum_{A \in \mathcal{A}} \frac{x_A}{1 - x_A}\big)$
occurs. This easily implies that none of the n variables can be resampled more often. It is furthermore
shown that without loss of generality all x-values can be assumed to be bounded away from 1 by at least
$O(\epsilon)$. This simplifies the upper bound on the expected running time to $n \cdot O(\frac{1}{\epsilon} \log \frac{T}{\epsilon})$. □

As mentioned following the introduction of $\delta$ in (2), $\log(1/\delta) \le O(n \log n)$ in all applications known to us,
and is often even smaller.
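
The bound (4) can be sanity-checked numerically on small explicit instances. The sketch below assumes that every event depends on at least one variable and that at least two events share a variable (so that the w.l.o.g. assumption that the parameter from (2) is at most 1/4 applies); the function name `check_T_bound` and the dictionary encoding are illustrative.

```python
import math

def check_T_bound(vbl, x):
    """Return (T, n * log2(1/delta)) for events with variable sets `vbl`.

    `vbl` maps event names to sets of variables, `x` maps them to x-values.
    """
    variables = set().union(*vbl.values())
    # neighbors of an event: the other events sharing at least one variable
    gamma = {a: {b for b in vbl if b != a and vbl[a] & vbl[b]} for a in vbl}
    delta = min(x[a] * math.prod(1 - x[b] for b in gamma[a]) for a in vbl)
    T = sum(x.values())
    return T, len(variables) * math.log2(1 / delta)

# Three events on three variables, pairwise sharing one variable.
T, bound = check_T_bound({"A1": {1, 2}, "A2": {2, 3}, "A3": {3, 1}},
                         {"A1": 0.3, "A2": 0.3, "A3": 0.3})
```

Such a check only illustrates the inequality on a given instance; it is no substitute for the proof above.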



Remarks 

• The $\max_{A \in \mathcal{A}} \frac{1}{1 - x(A)}$ factor in the running time of part (1) of Theorem 3.1 corresponds to the expected
number of times the event A gets resampled until one satisfying assignment to its variables is found.
It is obviously unavoidable for an algorithm that has only black-box resampling and evaluation access
to the events. If one alters the algorithm to pick a random assignment that satisfies A (which can for
example be computed using rejection sampling, taking an expected $\Theta(\frac{1}{1 - x(A)})$ trials each time), this
factor can be avoided.

• The estimation $T = \sum_{A \in \mathcal{A}} x_A = O(n \log 1/\delta)$ is tight and can be achieved, e.g., by having an isolated
event with constant probability for each variable. In many cases with $\log 1/\delta = \omega(\log n)$ it is
nevertheless an overestimate, and in most cases the running time is $O(n \log n)$ even for $\epsilon = 0$.

While Theorem 3.1 gives very good bounds on the running time of MT even for applications with
$\Omega(n) \le m \le \mathrm{poly}(n)$ many events, it unfortunately often fails to be directly applicable when m becomes
super-polynomial in n. The reason is that maintaining bad events implicitly and running the resampling process
requires an efficient way to find violated events. In many examples like those of Sections 4, 5 and 6 with
super-polynomially many events, finding violated events or even just verifying a good assignment is not
known to be in polynomial time (often even provably NP-hard). To capture the sets of events for which we
can run the MT algorithm efficiently we use the following definition:

Definition 3.2. (Efficient verifiability) A set $\mathcal{A}$ of events that are determined by variables in $\mathcal{P}$ is
efficiently verifiable if, given an arbitrary assignment to $\mathcal{P}$, we can efficiently find an event $A \in \mathcal{A}$ that holds
or detect that there is no such event.

Because many large A of interest are not efficiently verifiable, a direct application of the MT-algorithm is not 
efficient. Nevertheless we show in the rest of this section that using the randomness in the output distribution 
of the MT-algorithm characterized by Theorem 2.2, it is still practically always possible to obtain efficient 
Monte Carlo algorithms that produce a good assignment with high probability. 

The main idea is to judiciously select an efficiently verifiable core subset $\mathcal{A}' \subseteq \mathcal{A}$ of bad events and apply the
MT-algorithm to it. Essentially, instead of looking for violated events in $\mathcal{A}$, we only resample events from $\mathcal{A}'$
and terminate when we cannot find any violated event there. The non-core events will have small probabilities
and will be sparsely connected to core events, and as such their probabilities in the LLL-distribution, and
therefore also in the output distribution of the algorithm, do not blow up by much. There is thus hope that
the non-core events remain unlikely to happen even though they were not explicitly fixed by the algorithm.
Theorem 3.3 shows that if the LLL-conditions are fulfilled for $\mathcal{A}$, then a non-core event $A \in \mathcal{A} \setminus \mathcal{A}'$ is violated
in the produced output with probability at most $x_A$. This makes the success probability of such an approach
at least $1 - \sum_{A \in \mathcal{A} \setminus \mathcal{A}'} x_A$.
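The modified resampling loop can be sketched as follows. This is a minimal illustration and not the authors' implementation: the event interface (pairs of a variable list and a predicate), the name `resample_core_events`, the `sample` callback and the resampling budget are all assumptions made for the example. Non-core events are never inspected; only core events are checked and resampled.

```python
import random


def resample_core_events(variables, core_events, sample, rng=None, max_resamplings=100_000):
    """Run the resampling loop on the core events only.

    variables   -- iterable of variable names
    core_events -- list of (event_variables, holds) pairs, where holds(assignment)
                   returns True when the bad event holds (hypothetical interface)
    sample      -- sample(name, rng) draws a fresh independent value for a variable
    """
    rng = rng or random.Random(0)
    assignment = {v: sample(v, rng) for v in variables}
    for resamplings in range(max_resamplings + 1):
        # Look for a violated event among the core events only.
        violated = next((ev for ev in core_events if ev[1](assignment)), None)
        if violated is None:
            return assignment, resamplings
        # Resample exactly the variables the violated event depends on.
        for v in violated[0]:
            assignment[v] = sample(v, rng)
    raise RuntimeError("resampling budget exhausted")
```

For instance, with 30 fair bits and one core event per window of five consecutive ones, the call returns an assignment in which no such window occurs, together with the number of resamplings used.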

Theorem 3.3. Let $\mathcal{A}' \subseteq \mathcal{A}$ be an efficiently verifiable core subset of $\mathcal{A}$. If there is an $\epsilon \in [0, 1)$ and an
assignment of reals $x : \mathcal{A} \to (0, 1)$ such that:

$$\forall A \in \mathcal{A}: \quad \Pr[A] \le (1 - \epsilon)\, x(A) \prod_{B \in \Gamma(A) \cap \mathcal{A}'} (1 - x(B)),$$

then the modified MT-algorithm can be efficiently implemented with an expected number of resamplings
according to Theorem 3.1. The algorithm furthermore outputs a good assignment with probability at least
$1 - \sum_{A \in \mathcal{A} \setminus \mathcal{A}'} x_A$.

Proof. Note that the set $\mathcal{A}'$ on which the actual MT-algorithm is run fulfills the LLL-conditions. This
makes Theorem 3.1 applicable. To argue about the success probability of the modified algorithm, note that
$x(A) \ge \Pr[A] \prod_{B \in \Gamma'(A)} (1 - x(B))^{-1}$, where $\Gamma'(A)$ are the neighbors of $A$ in the variable sharing graph defined
on $\mathcal{A}'$. Using Theorem 2.2 we get that the probability that a non-core bad event $A \in \mathcal{A} \setminus \mathcal{A}'$ holds in the
assignment produced by the modified algorithm is at most $x_A$. Since core-events are avoided completely by
the MT-algorithm, a simple union bound over all conditional non-core event probabilities results in a failure
probability of at most $\sum_{A \in \mathcal{A} \setminus \mathcal{A}'} x_A$.

Here is furthermore a direct proof of the theorem incorporating the argument from Theorem 2.2 into the 
proof: 
Redefine the witness trees of [40] to have only events from $\mathcal{A}'$ in non-root nodes, thus getting a modification
of the Galton-Watson process from Section 3 of [40]. As in [40], we grow witness trees from an execution-log
starting with a root event that holds at a certain point in time. This guarantees that we capture events
$A \in \mathcal{A} \setminus \mathcal{A}'$ happening even though they are never resampled (since we never check whether such events $A$
hold or not). Note that if some $A \in \mathcal{A} \setminus \mathcal{A}'$ holds after termination, then there is a witness tree with $A$ as root
and with all non-root nodes belonging to $\mathcal{A}'$. Following the proof of Lemma 3.1 from [40], the probability for
this to happen is at most $\sum_{A \in \mathcal{A} \setminus \mathcal{A}'} x_A$. (We do not get $\sum_{A \in \mathcal{A} \setminus \mathcal{A}'} x_A/(1 - x_A)$, since $A$ cannot be a child of
itself in the witness trees that we construct.) □



While the concept of an efficiently verifiable core is easy to understand, it is not clear how often and how 
such a core can be found. Furthermore having such a core is only useful if the probability of the non- 
core events is small enough to make the failure probability, which is based on the union bound over those 
probabilities, meaningful. The following main theorem shows that in all applications that can tolerate a
small "exponential" $\epsilon$-slack as introduced by [21], finding such a good core is straightforward:

Theorem 3.4. Suppose there is a fixed constant $\epsilon \in (0, 1)$ and an assignment of reals $x : \mathcal{A} \to (0, 1 - \epsilon)$
such that:

$$\forall A \in \mathcal{A}: \quad \Pr[A]^{1-\epsilon} \le x(A) \prod_{B \in \Gamma(A)} (1 - x(B)).$$

Suppose further that $\log(1/\delta) \le \mathrm{poly}(n)$, where $\delta = \min_{A \in \mathcal{A}} x(A) \prod_{B \in \Gamma(A)} (1 - x(B))$. Then for every $p \ge
1/\mathrm{poly}(n)$ the set $\{A_i \in \mathcal{A} : \Pr[A_i] \ge p\}$ has size at most $\mathrm{poly}(n)$, and is thus essentially always an efficiently
verifiable core subset of $\mathcal{A}$. If this is the case, then there is a Monte Carlo algorithm that terminates after
a number of resamplings bounded as in Theorem 3.1 and returns a good assignment with probability at least $1 - n^{-c}$, where $c > 0$ is any
desired constant.



Proof. For a probability $p = 1/\mathrm{poly}(n)$ to be fixed later we define $\mathcal{A}'$ as the set of events with probability
at least $p$. Recall from Theorem 3.1 that $\sum_{A \in \mathcal{A}} x_A \le O(n \log(1/\delta))$. Since $x_A \ge p$ for $A \in \mathcal{A}'$, we get that
$|\mathcal{A}'| \le O(n \log(1/\delta)/p) = \mathrm{poly}(n)$. By assumption $\mathcal{A}'$ is efficiently verifiable and we can run the modified
resampling algorithm with it.

For every event we have $\Pr[A] \le x_A < 1 - \epsilon$ and thus get a $(1 - \epsilon)^{\epsilon} = (1 - \Theta(\epsilon^2))$-slack; therefore Theorem 3.1
applies and guarantees that the algorithm terminates with high probability within the number of resamplings it states.

To prove the failure probability, note that for every non-core event $A \in \mathcal{A} \setminus \mathcal{A}'$, the LLL-conditions with
the "exponential $\epsilon$-slack" provide an extra multiplicative $p^{-\epsilon}$ factor over the LLL-conditions in Theorem
3.1. We have $x(A) \Pr[A]^{\epsilon} \ge \Pr[A] \prod_{B \in \Gamma'(A)} (1 - x(B))^{-1}$, where $\Gamma'(A)$ are the neighbors of $A$ in the variable
sharing graph defined on $\mathcal{A}'$. Using Theorem 2.2 and setting $p = n^{-\Theta(c/\epsilon)}$, we get that the probability that
a non-core bad event $A \in \mathcal{A} \setminus \mathcal{A}'$ holds in the assignment produced by the modified algorithm is at most
$x_A \Pr[A]^{\epsilon} \le x_A n^{-\Theta(c)}$. Since core-events are avoided completely by the MT-algorithm, a simple union bound
over all conditional non-core event probabilities results in a failure probability of at most $n^{-\Theta(c)} \sum_{A \in \mathcal{A} \setminus \mathcal{A}'} x_A$.
Now since $\sum_{A \in \mathcal{A} \setminus \mathcal{A}'} x_A \le \sum_{A \in \mathcal{A}} x_A = T = \mathrm{poly}(n)$ holds, we get that we fail with probability at most $n^{-c}$
on non-core events while safely avoiding the core. This completes the proof of the theorem. □
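As a small illustration of the core construction used in the proof, the following sketch splits a set of events at a probability threshold $p$ and evaluates the union bound of Theorem 3.3 over the non-core events. The dictionary layout and the name `split_core` are assumptions for the example; in a real application the events are usually given implicitly rather than as an explicit table.

```python
def split_core(events, p):
    """Split events into a core and a failure bound for the rest.

    events -- dict mapping an event name to a pair (probability, x_value)
    p      -- probability threshold; events with probability >= p form the core
    Returns the set of core event names and the sum of x over the non-core
    events, which bounds the failure probability by a union bound.
    """
    core = {name for name, (prob, _) in events.items() if prob >= p}
    failure_bound = sum(x for name, (_, x) in events.items() if name not in core)
    return core, failure_bound
```

Lowering $p$ enlarges the core and shrinks the bound on the failure probability, at the price of more events that have to be checked explicitly.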



The last theorem nicely completes this section; it shows that in practically all applications of the general LLL 
it is possible to obtain a fast Monte Carlo algorithm with arbitrarily high success probability. The conditions 
of Theorem 3.4 are very easy to check and are usually directly fulfilled. That is, in all LLL-based proofs
(with a large number of events $A_i$) known to us, the set of high-probability events forms a polynomial-sized
core that is trivially efficiently verifiable, e.g., by exhaustive enumeration. Theorem 3.4 makes these proofs
constructive without further complicated analysis. In most cases, only some adjustments in the bounds are
needed to respect the $\epsilon$-slack in the LLL-condition.
Remarks



• Note that the failure probability can be made an arbitrarily small inverse polynomial. This is important
since for problems with non-efficiently verifiable solutions the success probability of Monte Carlo
algorithms cannot be boosted using standard probability amplification techniques.

• In all applications known to us, the core above has further nice structure: usually the probability of
an event $A_i$ is exponentially small in the number of variables it depends on. Thus, each event in the
core only depends on $O(\log n)$ many variables, and hence the core is usually trivial to enumerate. This makes the
core efficiently verifiable, even when finding a general violated event in $\mathcal{A}$ is NP-hard.

• The fact that the core consists of polynomially many events with usually logarithmically many variables 
each, makes it often even possible to enumerate the core in parallel and to evaluate each event in parallel. 
If this is the case one can get an RNC algorithm by first building the dependency graph on the core 
and then computing an MIS of violated events in each round (using MIS algorithms such as [3, 35]). 
Using the proof of Theorem 3.1 which is based on some ideas from the parallel LLL algorithm of MT, 
it is easy to see that only logarithmically many rounds of resampling these events are needed. 

• Even though the derandomization of [21] also only requires an "exponential $\epsilon$-slack" in the LLL-conditions,
applying the techniques used there and in general getting efficient deterministic algorithms
when $m$ is superpolynomial seems hard. The derandomization in [21] either explicitly works on all
$m$ events when applying the method of conditional probabilities or uses approximate $O(\log m)$-wise
independent probability spaces which have an inherently $\mathrm{poly}(m)$ size domain.
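The round structure behind the RNC remark can be illustrated sequentially. The sketch below is only a stand-in: it picks a maximal set of violated core events that share no variables with a greedy scan instead of a parallel MIS algorithm such as [3, 35], and the event interface, the name `resample_in_rounds` and the round budget are assumptions made for the example.

```python
import random


def resample_in_rounds(variables, core_events, sample, rng=None, max_rounds=10_000):
    """Resample a maximal independent set of violated core events per round.

    core_events -- list of (event_variables, holds) pairs (hypothetical interface)
    sample      -- sample(name, rng) draws a fresh independent value for a variable
    Returns the final assignment and the number of rounds that resampled something.
    """
    rng = rng or random.Random(0)
    assignment = {v: sample(v, rng) for v in variables}
    for round_number in range(max_rounds + 1):
        violated = [ev for ev in core_events if ev[1](assignment)]
        if not violated:
            return assignment, round_number
        # Greedy maximal independent set: keep an event if it shares no
        # variable with the events already chosen in this round.
        used = set()
        for event_variables, _ in violated:
            if used.isdisjoint(event_variables):
                used.update(event_variables)
        # All chosen events are variable-disjoint, so they can be resampled together.
        for v in used:
            assignment[v] = sample(v, rng)
    raise RuntimeError("round budget exhausted")
```

In a parallel implementation the violated events of a round would be found, and the independent set computed, by separate processors; the sequential scan here only mirrors the per-round logic.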

4 A Constant-Factor Approximation Algorithm for the Santa Claus 
Problem 

The Santa Claus problem is the restricted assignment version of the max-min allocation problem of indivisible 
items. In this section, we present the first efficient randomized constant-factor approximation algorithm for 
this problem. 

In the max-min allocation problem, there is a set $C$ of $n$ items, and $m$ children. The value (utility) of item $j$ to
child $i$ is $p_{i,j} \ge 0$. An item can be assigned to only one child. If a child $i$ receives a subset of the items $S_i \subseteq C$,
then the total valuation of the items received by $i$ is $\sum_{j \in S_i} p_{i,j}$. The goal is to maximize the minimum
total valuation of the items received by any child, that is, to maximize $\min_i \sum_{j \in S_i} p_{i,j}$. (The "minmax"
version of this "maxmin" problem is the classical problem of makespan minimization in unrelated parallel
machine scheduling [34].) This problem has received much attention recently [15, 14, 26, 13, 16, 20, 44].
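To make the objective concrete, the following brute-force sketch evaluates $\min_i \sum_{j \in S_i} p_{i,j}$ over all assignments of items to children. It is exponential in the number of items and only meant for tiny instances; the function name and the matrix layout `values[i][j]` are assumptions for the example.

```python
from itertools import product


def max_min_allocation(values):
    """Return the best max-min value and one assignment achieving it.

    values[i][j] is the value of item j to child i. Every item is given to
    exactly one child; assignment[j] is the child that receives item j.
    """
    m, n = len(values), len(values[0])
    best_value, best_assignment = None, None
    for assignment in product(range(m), repeat=n):
        totals = [0] * m
        for item, child in enumerate(assignment):
            totals[child] += values[child][item]
        if best_value is None or min(totals) > best_value:
            best_value, best_assignment = min(totals), assignment
    return best_value, best_assignment
```

For the instance `[[3, 1, 0], [3, 0, 2]]`, in which the first item is wanted by both children, the optimum is 2: the contested item goes to the first child and the second child keeps the item only it values.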

A restricted version of max-min allocation is where each item has an intrinsic value, and where for every
child $i$, $p_{i,j}$ is either $p_j$ or 0. This is known as the Santa Claus problem. The Santa Claus problem is NP-hard
and no efficient approximation algorithm better than 1/2 can be obtained unless P = NP [18]. Bansal and
Sviridenko [15] considered a linear-programming (LP) relaxation of the problem known as the configuration
LP, and showed how to round this LP to obtain an $O(\log\log\log m / \log\log m)$-approximation algorithm for
the Santa Claus problem. They also showed a reduction to a crisp combinatorial problem, a feasible solution 
to which implies a constant-factor integrality gap for the configuration LP. 

Subsequently, Feige [26] showed that the configuration LP has a constant integrality gap. Normally such a
proof immediately gives a constant-factor approximation algorithm that rounds an LP solution along the line 
of the integrality-gap proof. In this case Feige's proof could not be made constructive because it was heavily 
based on repeated reductions that apply the asymmetric version of the LLL to exponentially many events. 
Due to this unsatisfactory situation, the Santa Claus problem was the first on a list of problems reported in 
the survey "Estimation Algorithms versus Approximation Algorithms" [27] for which a constructive proof 
would be desirable. Using a completely different approach, Asadpour, Feige and Saberi [13] could show that
the configuration LP has an integrality gap of at most ^. Their proof uses local-search and hypergraph 
matching theorems of Haxell [30]. Haxell's theorems are again highly non-constructive and the stated
local-search problem is not known to be efficiently solvable. Thus this second non-constructive proof still left the
question of a constant-factor approximation algorithm open.

In this section we show how our Theorem 3.4 can be used to easily and directly constructivize the LLL-based 
proof of Feige, giving the first constant-factor approximation algorithm for the Santa Claus problem.

It is to be noted that the more general max-min fair allocation problem appears significantly harder. It 
is known that for general max-min fair allocation, the configuration LP has a gap of $\Omega(\sqrt{m})$. Asadpour
and Saberi [14] gave an $O(\sqrt{m} \ln^3 m)$ approximation factor for this problem using the configuration LP.
Recently, Saha and Srinivasan [44] have improved this to 0( ^™in'^ )- So far the best approximation ratio 
known for this problem due to Chakrabarty, Chuzhoy and Khanna is $O(n^{\epsilon})$ [20], for any constant $\epsilon > 0$;
their algorithm runs in $O(n^{1/\epsilon})$ time.

4.1 Algorithm 

We focus on the Santa Claus problem here. We start by describing the configuration LP and the reduction 
of it to a combinatorial problem over a set system, albeit with a constant factor loss in approximation. Next 
we give a constructive solution for the set system problem, thus providing a constant-factor approximation 
algorithm for the Santa Claus problem. 

We guess the optimal solution value $T$ using binary search. An item $j$ is said to be small if $p_j < \alpha T$; otherwise
it is said to be big. Here $\alpha < 1$ is the approximation ratio, which will be fixed later. A configuration is a
subset of items. The value of a configuration $C$ to child $i$ is denoted by $p_{i,C} = \sum_{j \in C} p_{i,j}$. A configuration
$C$ is called valid for child $i$ if:

• $p_{i,C} \ge T$ and all the items are small; or

• $C$ contains only one item $j$ and $p_{i,j} = p_j \ge \alpha T$, that is, $j$ is a big item for child $i$.
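The two validity conditions can be checked directly. The sketch below assumes the Santa Claus setting, where child $i$ values item $j$ at its intrinsic value $p_j$ if the child wants the item and at 0 otherwise; the function name, the `wanted` set and the convention that items with $p_j \ge \alpha T$ count as big are assumptions for the example.

```python
def is_valid_configuration(config, intrinsic, wanted, T, alpha):
    """Check whether a set of items is a valid configuration for one child.

    config    -- iterable of item ids
    intrinsic -- dict mapping item id to its intrinsic value p_j
    wanted    -- set of items the child values at p_j (all others are worth 0)
    """
    items = set(config)

    def value(j):
        return intrinsic[j] if j in wanted else 0

    def is_big(j):
        return intrinsic[j] >= alpha * T

    # Second condition: a single big item that the child actually values.
    if len(items) == 1:
        (j,) = items
        if is_big(j) and j in wanted:
            return True
    # First condition: only small items, with total value at least T.
    return all(not is_big(j) for j in items) and sum(value(j) for j in items) >= T
```

With $T = 4$ and $\alpha = 1/2$, a single item of value 3 is valid on its own, three small items of values 1.5, 1.5 and 1 are valid together, and any set mixing a big item with other items is rejected.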

Let $\mathcal{C}(i,T)$ denote the set of all valid configurations corresponding to child $i$ with respect to $T$. We define
an indicator variable $y_{i,C}$ for each child $i$ and all valid configurations $C \in \mathcal{C}(i,T)$ such that it is 1 if child $i$
receives configuration $C$ and 0 otherwise. These variables are relaxed to take any fractional value in [0, 1] to
obtain the configuration LP relaxation.



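$$\begin{aligned}
\forall j: \quad & \sum_{C \ni j} \sum_{i} y_{i,C} \le 1 \\
\forall i: \quad & \sum_{C \in \mathcal{C}(i,T)} y_{i,C} = 1 \qquad (7) \\
\forall i, C: \quad & y_{i,C} \ge 0
\end{aligned}$$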

Bansal and Sviridenko showed that if the above LP is feasible, then it is possible to find a fractional allocation 
that assigns a configuration with value at least $(1 - \epsilon)T$ to each child in polynomial time.

The algorithm of Bansal and Sviridenko starts by solving the configuration LP (7). Then by various steps 
of simplification, they reduce the problem to the following instance: 
There are $p$ groups, each group containing $l$ children. Each child is associated with a collection of $k$ items
with a total valuation of $T/c$, for some constant $c > 0$. Each item appears in at most $\beta l$ sets for some $\beta \le 3$.
Such an instance is referred to as a $(k, l, \beta)$-system.

The goal is to efficiently select one child from each group and assign at least $\lfloor \gamma k \rfloor$ items to each of the chosen
children, such that each item is assigned only once. If such an assignment exists, then the corresponding
$(k, l, \beta)$-system is said to be $\gamma$-good.

Feige showed that indeed the $(k, l, \beta)$-system that results from the configuration LP is $\gamma$-good, where $\gamma =$
O (^ ^^^(^i ^-( ^ [26]. This established a constant factor integrality gap for the configuration LP. However, the 
proof being non-constructive, no algorithm was known to efficiently find such an assignment. In the remainder
of this section, we make Feige's argument constructive, thus giving a constant-factor approximation algorithm
for the Santa Claus problem. But before that, for the sake of completeness, we briefly describe the procedure
that obtains a $(k, l, \beta)$-system from an optimal solution of the configuration LP [15].

4.2 From a configuration LP solution to a $(k, l, \beta)$-system

The algorithm starts by simplifying the assignment of big items in an optimal solution (say) $y^*$ of the
configuration LP. Let $J_B$ denote the set of big items. Consider a bipartite graph $G$ with children $M$ on the
right side and big items $J_B$ on the left side. An edge $(i, j)$, $i \in M$, $j \in J_B$, of weight $w_{i,j} = \sum_{C \in \mathcal{C}(i,T):\, j \in C} y^*_{i,C}$ is
inserted in $G$ if $w_{i,j} > 0$. These values are then modified such that after the modification the edges of
$G$ with weight in (0,1) form a forest.

Lemma 5 [15]. The solution y* can be transformed into another feasible solution of the configuration LP 
where the graph G is a forest. 

The transformation is performed using the simple cycle-breaking trick. Each cycle is broken into two matchings;
weights on the edges of one matching are increased gradually while the weights on the other are
decreased until some weights hit 0 or 1. If a $w_{i,j}$ becomes 0 in this procedure, the edge $(i, j)$ is removed from
$G$. Else if it becomes 1, then item $j$ is permanently assigned to child $i$ and the edge $(i, j)$ is removed.

Suppose $G'$ is the forest obtained after this transformation. The forest structure is then further exploited to
form groups of children and big items. 

Lemma 6 [15]. The solution $y^*$ can be transformed into another solution $y'$ such that children $M$ and big
items $J_B$ can be clustered into $p$ groups $M_1, M_2, \ldots, M_p$ and $J_{B,1}, J_{B,2}, \ldots, J_{B,p}$ respectively with the following
properties.

1. For each $i = 1, 2, \ldots, p$, the number of jobs $J_{B,i}$ in group $M_i$ is exactly $|M_i| - 1$. The group $J_{B,i}$ could
possibly be empty.

2. Within each group the assignment of big jobs is entirely flexible in the sense that they can be placed
feasibly on any of the $|M_i| - 1$ children out of the $|M_i|$ children.

3. For each group $M_i$, the solution $y'$ assigns exactly one unit of small configurations to children in $M_i$
and all the $|M_i| - 1$ units of configurations correspond to big jobs in $J_{B,i}$. Also, for each small job $j$,

Lemma 6 implies that the assignment of big items to children in a group is completely flexible and can
be ignored. We only need to choose one child from each group who will be satisfied by a configuration of
small items. If $y'$ assigns a small configuration $C$ to an extent of $y'_{c,C}$ to some child $c \in M_i$, $i \in [1, p]$,
then we say that $M_i$ contains the small configuration $C$ for child $c \in M_i$. Without loss of generality, it
can be assumed that each child in the groups is fractionally assigned to exactly one small configuration.
Bansal and Sviridenko further showed that $y'$ can again be simplified such that each small configuration is
assigned to an extent of at least $1/l$ to each child and for each small job $j$, $\sum_{C \ni j} \sum_i y'_{i,C} \le 3$. This
implies that, if we consider all the small configurations across the $p$ groups, then each small job appears in at most
$\beta l$ configurations, where $\beta = 3$.

Finally, the following lemma shows that by losing a constant factor in the approximation, one can assume 
that all the small jobs have the same size.

Lemma 8 [15]. Given the algorithmic framework above, by losing a constant factor in the approximation, 
each small job can be assumed to have size — . 

As a consequence of the above lemma, we now have the following scenario. 

There are $p$ groups $M_1, M_2, \ldots, M_p$, each containing at most $l$ children. Each child is associated with a set
that contains $k$ items. Each item belongs to at most $\beta l$ sets. The goal is to pick one child from each
group and assign at least a constant fraction of the items in its set such that each item is assigned exactly
once.

Therefore, we arrive at what is referred to as a $(k, l, \beta)$-system.

4.3 Construction of a $\gamma$-good solution for a $(k, l, \beta)$-system

We now point out the main steps in Feige's algorithm and describe in detail the modifications required to
make Feige's algorithm constructive.

Feige's Nonconstructive Proof for $\gamma$-good $(k, l, \beta)$-system: Feige's approach is based on a systematic
reduction of $k$ and $l$ in iterations, finally arriving at a system where $k$ or $l$ are constants. For constant $k$ or
$l$ the following lemma asserts a constant $\gamma$.

Lemma 4.1 (Lemma 2.1 and 2.2 of [26]). For every $(k, l, \beta)$-system a $\gamma$-good solution with $\gamma$ satisfying,
7 = or ^k = L^^J '^'^'^ found efficiently. 

The reduction of a $(k, l, \beta)$-system to constant $k$ and $l$ involves two main lemmas, which we refer to as the Reduce-l
lemma and the Reduce-k lemma respectively.

Lemma 4.2 (Lemma 2.3 of [26], Reduce-l). For $l > c$ ($c$ is a sufficiently large constant), every $\gamma$-good
{k,l, (3)-system with k < I can be transformed into a "f-good {k,l' , j3')-system with V < log^ Z and j3' < 

Lemma 4.3 (Lemma 2.4 of [26], Reduce-k). Every $(k, l, \beta)$-system with $k > l > c$ can be transformed into a
{k' ,1, (3) -system with k' < ^ and with the following additional property: if the original system is not ^-good, 
then the new system is not ^' -good for 7' = 7(1 + 3^^^). Conversely, if the new system is 'j' -good, then the 
original system was $\gamma$-good.

If $\beta$ is not a constant to start with, then by applying the following lemma repeatedly, $\beta$ can be reduced
below 1.

Lemma 4.4 (Lemma 2.5 of [26]). For $l > c$, every $\gamma$-good $(k, l, \beta)$-system can be transformed into a $\gamma$-good
{k', I, l3')-system with k' = [|J and /?' < § (l ^) . 

However in our context, $\beta \le 3$, thus we ignore Lemma 2.5 of [26] from further discussions.

Starting from the original system, as long as $l > c$, Lemma Reduce-l is applied when $l > k$ and Lemma
Reduce-k is applied when $k > l$. In this process $\beta$ grows at most by a factor of 2. Thus at the end, $l$ is a
constant and so is $\beta$. Thus by applying Lemma 4.1, the constant integrality gap for the configuration LP is
established.
Randomized Algorithm for $\gamma$-good $(k, l, \beta)$-system: There are two main steps in the algorithm.

1. Show a constructive procedure to obtain the reduced system through Lemma Reduce-l and Lemma
Reduce-k.

2. Map the solution of the final reduced system back to the original system.

We now elaborate upon each of these.

4.3.1 Making Lemma Reduce-l Constructive

This follows quite directly from [40]. The algorithm picks [log^ l\ sets uniformly at random and independently 
from each group. Thus while the value of k remains fixed, / is reduced to I' = [log^ l\ . Now in expectation 
the value of /3 does not change and the probability that /?' > (3{1 + j^), and hence P'l' > + j^), is at 
most ' /3iog^z < g-iog-''/ _ ^-log^;^ define a bad event corresponding to each element: 

• $A_j$: Element $j$ has more than $\beta' l'$ copies.

Now noting that the dependency graph has degree at most $k l \beta l \le 6 l^3$, the uniform (symmetric) version of
the LLL applies. Now it is easy to check if there exists a violated event: we simply count the number of times 
an element appears in all the sets. Thus we directly follow [40]; setting Xj^^ = —jj^^, we get the expected 

running time to avoid all the bad events to be 0(plk/l^°^^ ~ 0(p) = 0(m). 
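Checking the bad events $A_j$ amounts to counting how often each element occurs among the selected sets. The sketch below illustrates the sampling step and the counting check; the names `pick_sets` and `overloaded_elements`, the nested-list layout of the groups and sampling without replacement are assumptions for the example, and the number of sets kept per group is passed in as a plain parameter.

```python
import random
from collections import Counter


def pick_sets(groups, l_prime, rng):
    """Keep l_prime randomly chosen sets from every group (fewer if the group is smaller).

    groups -- list of groups, each group being a list of sets of elements
    """
    return [rng.sample(group, min(l_prime, len(group))) for group in groups]


def overloaded_elements(groups, limit):
    """Return the elements that appear in more than `limit` of the given sets."""
    counts = Counter(item for group in groups for s in group for item in set(s))
    return {item for item, count in counts.items() if count > limit}
```

A resampling step would then redraw the selections in the groups that contain a set with an overloaded element and repeat the count.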

4.3.2 Making Lemma Reduce-k Constructive 

This is the main challenging part. The random experiment involves selecting each item independently at 
random with probability $\frac{1}{2}$. To characterize the bad events, we need a structural lemma from [26]. Construct
a graph on the sets, where there is an edge between two sets if they share an element. A collection of sets 
is said to be connected if and only if the subgraph induced by this collection is connected. 

We consider two types of bad events: 

1. Bi. some set has less than k' = ^1 — ^^^^ f items surviving, and 

2. Bi for i > 2: there is a connected collection of i sets from distinct groups whose union originally 
contained at most ijk items, of which more than iS' ^ items survive, where J' = 7 ^1 + ^^^^ • 

If none of the above bad events happen, then we can consider the first $k'$ items from each set and yet the
second type of bad events do not happen. These events are chosen such that 7'-goodncss (7' — (5'-|p- < 

7^1 + ^-^^^ ) of the new system certifies that the original system was 7-good. That this is indeed the case 
follows directly from Hall's theorem as proven by Feige: 

Lemma 4.5 (Lemma 2.7 of [26]). Consider a collection of $n$ sets and a positive integer $q$.

1. If for some $1 \le i \le n$, there is a connected subcollection of $i$ sets whose union contains fewer than $iq$
items, then there is no choice of $q$ items per set such that all items are distinct.

2. If for every $i$, $1 \le i \le n$, the union of every connected subcollection of $i$ sets contains at least $iq$
(distinct) items, then there is a choice of $q$ items per set such that all items are distinct.
Feige showed in [26] that for bad events of type $B_i$, $i \ge 1$, taking $x_i = 2^{-10 i \log k}$ is sufficient to satisfy the
condition (1) of the asymmetric LLL. More precisely, suppose we define, for any bad event $B \in \bigcup_{i \ge 1} B_i$,
$\Gamma(B)$ to be as in Section 1.3: i.e., $\Gamma(B)$ is the set of all bad events $A \ne B$ such that $A$ and $B$ both depend
on at least one common random variable in our "randomly and independently selecting items" experiment.
Then, it is shown in [26] that with the choice $x_i = 2^{-10 i \log k}$ for all events in $B_i$, we have

$$\forall (i \ge 1)\ \forall (B \in B_i): \quad \Pr[B] \le 2^{-20 i \log k} \le x_i \prod_{j \ge 1} \prod_{A \in (B_j \cap \Gamma(B))} (1 - x_j). \qquad (8)$$

Thus by the LLL, there exists an assignment that avoids all the bad events. However, no efficient construction 
was known here, and as Feige points out, "the main source of difficulty in this respect is Lemma 2.4, because 
there the number of bad events is exponential in the problem size, and moreover, there are bad events 
that involve a constant fraction of the random variables." Our Theorem 3.4 again directly makes this 
proof constructive and gives an efficient Monte Carlo algorithm for producing a reduce-k system with high 
probability. 

Lemma 4.6. There is a Monte Carlo algorithm that produces a valid reduce-k system with probability at
least $1 - 1/m^2$.

Proof. Note from (8) that we can take $\delta = 2^{-20 m \log k}$. So, we get that $\log(1/\delta) = O(m \log k) = O(n \log n)$,
where $n$ is the number of items and $m \le n$ is the number of children. We furthermore get that all events
with probability larger than a fixed inverse-polynomial involve only connected subsets of size $O(\log m / \log k)$, and
Theorem 3.4 implies that there are only polynomially many such "high" probability events. (This can
also be seen directly since the degree of a subset is bounded by $k \beta l \le 6k^2$ and the number of connected
subcollections is therefore at most $(6k^2)^{O(\log m / \log k)} = m^{O(1)} = n^{O(1)}$.) The connected collections of subsets are
easy to enumerate using, e.g., breadth-first search and are therefore efficiently verifiable (in fact, even in
parallel). Theorem 3.4 thus applies and directly proves the lemma. □
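The enumeration of small connected collections can be sketched as a level-by-level search that grows each connected collection by one neighbouring set at a time. The function name and the representation of a collection as a frozenset of set indices are assumptions for the example, and the additional requirement that the sets come from distinct groups is left out to keep the sketch short.

```python
def connected_collections(sets, max_size):
    """Enumerate all connected collections of at most max_size sets.

    sets -- list of Python sets; two sets are adjacent if they share an element
    Returns a set of frozensets of indices into `sets`.
    """
    n = len(sets)
    adjacent = {i: {j for j in range(n) if j != i and sets[i] & sets[j]} for i in range(n)}
    found = {frozenset([i]) for i in range(n)}
    frontier = set(found)
    for _ in range(max_size - 1):
        next_frontier = set()
        for collection in frontier:
            # Every connected collection of size t+1 extends one of size t by a neighbour.
            neighbours = set().union(*(adjacent[i] for i in collection)) - collection
            for j in neighbours:
                next_frontier.add(collection | {j})
        next_frontier -= found
        found |= next_frontier
        frontier = next_frontier
    return found
```

Each enumerated collection can then be tested against the corresponding bad event by counting the surviving items in its union.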

4.3.3 Mapping the solution of the final reduced system back to the original system 

By repeatedly applying algorithms to produce a Reduce-l or Reduce-k system, we can completely reduce
the original system to a system with a constant number of children per group, where $\beta$ can increase from
3 to at most 6 due to Lemma Reduce-l. This involves at most $\log m$ Reduce-l reductions and at most $\log n$
Reduce-k reductions. We can furthermore assume that $n < 2^m$ since otherwise simply all combinations of
one child per group could be tried in time polynomial in $n$. Since each Reduce-l or Reduce-k operation
produces a desired solution with probability at least $1 - 1/m^2$, by a union bound, with probability at least
$1 - O(\log n \log m / m^2) = 1 - O(\log m / m)$ a final $(k, l, \beta)$-system is produced that is $\gamma$-good for some constant
$\gamma$ by Lemma 4.1. Using Lemma 4.1, we can also find a $\gamma$-good selection of children. Now, once one child
from each group is selected, we can construct a standard network flow instance to assign items to these
chosen children (Lemma 4.8). This finishes the process of mapping back a solution of the reduced system to
the original $(k, l, \beta)$-system. While checking whether an individual reduction failed seems to be an NP-hard
task, it is easy to see in the end whether a good enough assignment is produced. This enables us to rerun
the algorithm in the unlikely event of a failure. Thus, the Monte Carlo algorithm can be strengthened to an
algorithm that always produces a good solution and has an expected polynomial running time.
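The rerun argument is the usual conversion of a Monte Carlo procedure with a checkable final output into one that always succeeds. A minimal sketch, in which `attempt` and `is_good` are hypothetical callbacks standing for one run of the reductions and for the final check of the produced assignment, and the attempt budget is an added safeguard:

```python
def run_until_good(attempt, is_good, max_attempts=1000):
    """Repeat a randomized attempt until its output passes the final check.

    attempt -- callable that performs one full randomized run and returns a candidate
    is_good -- callable that verifies a candidate
    Returns the accepted candidate and the number of attempts used.
    """
    for attempts in range(1, max_attempts + 1):
        candidate = attempt()
        if is_good(candidate):
            return candidate, attempts
    raise RuntimeError("no good solution found within the attempt budget")
```

If a single run succeeds with probability at least $q$, the expected number of attempts is at most $1/q$, so a polynomial-time run with constant success probability yields an expected polynomial running time.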

The details of the above are given in two lemmas, Lemma 4.7 and Lemma 4.8. Theorem 4.9 follows from
the two lemmas.

Suppose we start with a $(k_1, l_1, \beta_1)$-system and after repeated application of either Lemma Reduce-l or
Lemma Reduce-k arrive at a $(k_s, l_s, \beta_s)$-system, where $l_s < c$, a constant. We then employ Lemma 4.1 to
obtain a $\gamma_s$-good $(k_s, l_s, \beta_s)$-system, where $\gamma_s$ satisfies the condition of Lemma 4.1. Since $l_s$ is a constant and $\beta_s \le 6$,
$\gamma_s$ is also a constant. Lemma 4.1 also gives a choice of a child from each group, denoted by a function
$f : \{1, \ldots, p\} \to \{1, \ldots, l_s\}$, that serves as a witness for $\gamma_s$-goodness of the $(k_s, l_s, \beta_s)$-system. We use this same
mapping for the original system. The following lemma establishes the goodness of the $(k_1, l_1, \beta_1)$-system.

Lemma 4.7. Given a sequence of reductions of $k$, $(k_1, l_1, \beta_1) \to \ldots \to (k_s, l_s, \beta_s)$, interleaved with reductions
of $l$, let for all $s \ge 2$, $\gamma_s$ be obtained from $\gamma_{s-1}$ as $\gamma'$ is obtained from $\gamma$ in Lemma 4.3. Then if the final reduced system is $\gamma_s$-good and the function
$f : \{1, \ldots, p\} \to \{1, \ldots, l_s\}$ serves as a witness for its $\gamma_s$-goodness, then $f$ also serves as a witness of $\gamma$-goodness
of the $(k_1, l_1, \beta_1)$-system with high probability. In other words, we can simply use the assignment given
by $f$ to select one child from each group and that assignment serves as a witness of $\gamma$-goodness of the original
system with high probability.

Proof. Suppose there exists a function $f$ that serves as a witness for $\gamma_s$-goodness of the $(k_s, l_s, \beta_s)$-system, but
does not serve as a witness that the $(k_{s-1}, l_{s-1}, \beta_{s-1})$-system is $\gamma_{s-1}$-good. Then there must exist a connected
collection of $i$, $i \ge 1$, sets chosen from the $p$ groups according to $f$, such that their union contains fewer than $\gamma_{s-1} k_{s-1} i$
items. However in the reduced system, their union has $\gamma_s k_s i$ elements. Call such a function $f$ bad. Thus
every bad function is characterized by a violation of an event of type $B_i$, $i \ge 1$, described in Section 4.3.2.
However, by Lemma 4.6 we have $\Pr[\exists \text{ a bad function } f] \le \Pr[\text{an event of type } B_i, i \ge 1, \text{ happens}] \le 1/m^2$.

Now the maximum number of times the Reduce-k step is applied is at most $\log k_1 \le \log n$. Thus if the
Reduce-l step is not applied at all, then by a union bound, function $f$ is $\gamma$-good for the $(k_1, l_1, \beta_1)$-system
with probability at least $1 - \log n / m^2$. We can assume without loss of generality that $n < 2^m$. (Otherwise in
polynomial time we can guess the children who receive small items and thus know $f$. Once $f$ is known, an
assignment of small items to the children chosen by $f$ can be done in polynomial time through Lemma 4.8.)
Since $n < 2^m$, function $f$ is $\gamma$-good for the $(k_1, l_1, \beta_1)$-system with probability at least $1 - \log m / m$. Now
since the Reduce-l step only reduces $l$ and keeps $k$ intact, it does not affect the goodness of the set system.

Once we know the function $f$, using Lemma 4.8, we can get a valid assignment of $\lfloor \gamma k \rfloor$ items to each chosen
child:

Lemma 4.8. Given a function $f : \{1, \ldots, p\} \to \{1, \ldots, l\}$ and a parameter $\gamma$, there is a polynomial time
algorithm to determine whether $f$ is $\gamma$-good, and we can determine the subset of $\lfloor \gamma k \rfloor$ items received by each
child $f(i)$, $i \in \{1, \ldots, p\}$.

Proof. We construct a bipartite graph with a set of vertices $U = \{1, \ldots, p\}$ corresponding to each chosen
child from the $p$ groups, a set of vertices $V$ corresponding to the small items in the sets of the chosen children,
a source $s$ and a sink $t$. Next we add a directed edge of capacity $\lfloor \gamma k \rfloor$ from the source $s$ to each vertex in $U$. We
also add directed edges $(u, v)$, $u \in U$, $v \in V$, if the item $v$ belongs to the set of $u$. These edges have capacity
1. Finally we add a directed edge from each vertex in $V$ to the sink $t$ with capacity 1. We claim that this
flow network has a maximum flow of $\lfloor \gamma k \rfloor p$ iff $f$ is $\gamma$-good:

For the one direction let $f$ be $\gamma$-good. Thus there exists a set of $\lfloor \gamma k \rfloor$ elements that can be assigned to each
child $u \in U$. Send one unit of flow from each child to these items that it receives. The outgoing flow from
each $u \in U$ is exactly $\lfloor \gamma k \rfloor$. Since each item is assigned to at most one child, the flow on each edge $(v, t)$, $v \in V$,
is at most 1. Thus all the capacity constraints are maintained and the flow value is $\lfloor \gamma k \rfloor p$.

For the other direction consider an integral maximum flow of $\lfloor \gamma k \rfloor p$. Since the total capacity of all the edges
emanating from the source is $\lfloor \gamma k \rfloor p$, they must all be saturated by the maxflow. Since the flow is integral,
for each child $u$ there are exactly $\lfloor \gamma k \rfloor$ edges with flow 1 corresponding to the items that it receives. Also
since no edge capacity is violated, each item is assigned to at most one child. Therefore $f$ is $\gamma$-good.

To check a function $f$ for $\gamma$-goodness and obtain the good assignment, we construct the flow graph and run
a max flow algorithm that outputs an integral flow. As proven above, a max flow value of $\lfloor \gamma k \rfloor p$ indicates
$\gamma$-goodness, and for a $\gamma$-good $f$ the assignment can be directly constructed from the flow by considering only
the flow carrying edges. □
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Theorem 4.9. There exists a constant a > and a randomized algorithm for the Santa Claus problem that 
runs in expected polynomial time and always assigns items of total valuation at least a ■ OPT to each child. 
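
The flow construction from the proof of Lemma 4.8 can be sketched in a few lines of Python. This is only an
illustration under stated assumptions: the function name `gamma_good_assignment`, the input format (one set of
item identifiers per chosen child) and the parameter `quota` (standing for $\lfloor \gamma k \rfloor$) are hypothetical,
and a plain unit-augmenting BFS max flow is used in place of whatever max-flow routine one would use in practice.

```python
from collections import defaultdict, deque

def gamma_good_assignment(child_items, quota):
    """Return one set of `quota` items per child (pairwise disjoint), or None.

    child_items[u] is the set of items child u may receive (hypothetical input format).
    """
    source, sink = "s", "t"
    cap = defaultdict(lambda: defaultdict(int))  # residual capacities
    for u, items in enumerate(child_items):
        cap[source][("c", u)] = quota            # source -> child, capacity quota
        for v in items:
            cap[("c", u)][("i", v)] = 1          # child -> item it may receive
            cap[("i", v)][sink] = 1              # item -> sink, each item used once

    def augment():
        # Find a shortest augmenting path by BFS and push one unit along it.
        parent = {source: None}
        queue = deque([source])
        while queue and sink not in parent:
            x = queue.popleft()
            for y, c in list(cap[x].items()):
                if c > 0 and y not in parent:
                    parent[y] = x
                    queue.append(y)
        if sink not in parent:
            return False
        y = sink
        while parent[y] is not None:
            x = parent[y]
            cap[x][y] -= 1
            cap[y][x] += 1
            y = x
        return True

    flow = 0
    while augment():
        flow += 1
    if flow < quota * len(child_items):
        return None                              # f is not gamma-good
    # The residual capacity item -> child equals the flow on child -> item.
    return [{v for v in items if cap[("i", v)][("c", u)] == 1}
            for u, items in enumerate(child_items)]
```

For example, `gamma_good_assignment([{1, 2, 3}, {3, 4}], 2)` has to give item 3 to the second child, so the
first child receives items 1 and 2.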

5 Non-repetitive Coloring of Graphs 

In this section, we give an efficient Monte Carlo construction for non-repetitive coloring of graphs. Call a
word (string) $w$ "squarefree" or "non-repetitive" if $w$ is not of the form $w = xx$ for any string
$x \neq \emptyset$. Let us refer to graphs using the symbol $H$ instead of $G$, to avoid confusion with our dependency
graphs $G$. Recall from Section 1 that a $k$-coloring of the edges of $H$ is called non-repetitive if the sequence
of colors along any path in $H$ is squarefree; i.e., we want a coloring in which no path has a color sequence of
the form $xx$. (All paths here are simple paths.) The smallest $k$ such that $H$ has a non-repetitive coloring
using $k$ colors is called the Thue number of $H$ and is denoted by $\pi(H)$. The Thue number was first defined by
Alon, Grytczuk, Hałuszczak and Riordan in [5]; it is named after Thue, who proved in 1906 that if $H$ is a simple
path, then $\pi(H) = 3$ [47]. While the method of Thue is constructive, no efficient construction is known
for general graphs. Alon et al. showed through an application of the asymmetric LLL that $\pi(H) \le c\Delta(H)^2$
for some absolute constant $c$. Their proof was non-constructive, and the number of bad events is exponential.
Moreover, checking whether a given coloring is non-repetitive is coNP-hard, even when the number of
colors is restricted to 4 [36]. Thus checking whether some "bad event" holds in a given coloring is coNP-hard.
Since the work of Alon et al., the non-repetitive coloring of graphs has received a good deal of attention
[22, 45, 29, 32, 19, 4]. Yet no efficient construction is known to date, except for some
special classes of graphs such as complete graphs, cycles and trees.
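
To make the definition concrete, here is a small Python sketch; the helper names `is_square` and
`is_squarefree` are introduced only for illustration. Since every sub-path of a path is again a path, requiring
that no path is colored $xx$ is the same as requiring that the color sequence of every path contains no
contiguous block of the form $xx$, which is what the second helper checks for a single sequence.

```python
def is_square(word):
    """True if word == x + x for some nonempty x (works for strings and lists)."""
    n = len(word)
    return n > 0 and n % 2 == 0 and word[: n // 2] == word[n // 2:]

def is_squarefree(word):
    """True if no contiguous block of word is a square."""
    n = len(word)
    return not any(
        is_square(word[start:start + 2 * half])
        for half in range(1, n // 2 + 1)
        for start in range(n - 2 * half + 1)
    )
```

For instance, `abcacb` contains no square block, while `abcbc` contains the square `bcbc`.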

5.1 Randomized algorithm for obtaining a non-repetitive coloring 

Suppose we are given a graph $H$ with maximum degree $\Delta$. We first give the proof of Alon et al., which shows
that $\pi(H) \le c\Delta^2$, and then show how to convert this proof directly into a constructive algorithm (with the
loss of a $\Delta^{\varepsilon}$ factor in the number of colors used):

Theorem 5.1 (Theorem 1 of [5]). There exists an absolute constant $c$ such that $\pi(H) \le c\Delta^2$ for all graphs
$H$ with maximum degree at most $\Delta$.

Proof. Let $C = (2e^{16} + 1)\Delta^2$. Color each edge of $H$ uniformly at random with one of $C$ colors. Consider
the following types of bad events $B_i$, for $i \ge 1$: "there exists a path $P$ of length $2i$ such that the second
half of $P$ is colored identically to its first half".

For a path $P$ of length $2i$, $i \ge 1$, we have $\Pr[P \text{ has a coloring of the form } xx] = 1/C^i$. Also, a path
of length $2i$ intersects at most $4ij\Delta^{2j}$ paths of length $2j$. Thus, for any bad event $A$ of type $i$, we have
$\Pr[A] = 1/C^i$, and each bad event of type $i$ shares variables with at most $4ij\Delta^{2j}$ bad events of type $j$.
Set $x_i = 1/(2^i \Delta^{2i})$. We have $(1 - x_i) \ge e^{-2x_i}$; this, along with the fact that
$\sum_{j \ge 1} j/2^j = 2$, shows that

$$x_i \prod_j (1 - x_j)^{4ij\Delta^{2j}} \ge x_i \exp\Big(-8i \sum_j \frac{j}{2^j}\Big) = x_i e^{-16i} = (2e^{16}\Delta^2)^{-i}.$$

Since $C = (2e^{16} + 1)\Delta^2$, the condition of the LLL is satisfied and we are guaranteed the existence of such a
non-repetitive coloring. □
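
As a sanity check of the calculation in this proof, the LLL condition $C^{-i} \le x_i \prod_j (1 - x_j)^{4ij\Delta^{2j}}$
can be evaluated numerically in log-space. The sketch below is illustrative only: the function names are
hypothetical, and the product is truncated at `max_j` factors (an assumption; in a graph on $n$ vertices only
path lengths up to $n$ occur, and the omitted factors are extremely close to 1).

```python
import math

def log_lll_rhs(i, delta, max_j=60):
    """log of x_i * prod_{j <= max_j} (1 - x_j)^(4 i j delta^(2j)), with x_j = (2 delta^2)^(-j)."""
    total = -i * math.log(2 * delta ** 2)           # log x_i
    for j in range(1, max_j + 1):
        x_j = (2 * delta ** 2) ** (-j)
        total += 4 * i * j * delta ** (2 * j) * math.log1p(-x_j)
    return total

def condition_holds(i, delta):
    """Check log Pr[type-i event] <= log of the (truncated) right-hand side."""
    num_colors = (2 * math.exp(16) + 1) * delta ** 2  # C from the proof
    return -i * math.log(num_colors) <= log_lll_rhs(i, delta)
```

The numerically evaluated right-hand side is in fact noticeably larger than the closed-form bound
$(2e^{16}\Delta^2)^{-i}$, because the estimate $(1 - x) \ge e^{-2x}$ is not tight for small $x$.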

We now show that using just a slightly higher number of colors suffices to make Theorem 3.4 apply.

Theorem 5.2. There exists an absolute constant $c$ such that for every constant $\varepsilon > 0$ there exists a Monte
Carlo algorithm that, given a graph $H$ with maximum degree $\Delta$, produces a non-repetitive coloring using at
most $c\Delta^{2+\varepsilon}$ colors. The failure probability of the algorithm is an arbitrarily small inverse polynomial
in the size of $H$.

Proof. We apply the LLL using the same random experiment and bad events as in Theorem 5.1, but with
$C' = C^{1/(1-\varepsilon')}$ colors such that $C' \le c\Delta^{2+\varepsilon}$. Using the same settings for $x_A$ gives an
exponential $\varepsilon'$-slack in the LLL-conditions, since the probability of a bad event of type $i$ is now at most
$1/C'^{\,i} = (1/C^i)^{1/(1-\varepsilon')}$. Recall Theorem 3.4. Clearly, $\log 1/\delta = O(n^2)$, and so the last thing to
check in order to apply Theorem 3.4 is that for any inverse polynomial $p$, the bad events with probability at
least $p$ are efficiently verifiable. Here these events correspond to paths shorter than a certain length (of the
form $O((1/\varepsilon) \log n / \log \Delta)$, where $n$ is the number of vertices), and Theorem 3.4 guarantees that there
are only polynomially many of these. Using breadth-first search to go through these paths and checking each of
them for non-repetitiveness is efficient, and thus Theorem 3.4 directly applies. □
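
The algorithmic idea behind this proof, resampling only the "core" bad events that belong to short paths, can
be sketched as follows. This is a minimal illustration under assumptions, not the algorithm of Theorem 3.4
itself: the names `short_paths`, `violated_path` and `color_core`, the adjacency-list input and the round
budget are hypothetical, and paths are enumerated by a naive depth-first search, which is only reasonable for
small graphs and small `max_edges`.

```python
import random

def edge_key(u, v):
    """Canonical key for the undirected edge {u, v}."""
    return (u, v) if u <= v else (v, u)

def short_paths(adj, max_edges):
    """Yield simple paths (vertex tuples) with an even number of edges between 2 and max_edges."""
    def extend(path, seen):
        num_edges = len(path) - 1
        if num_edges >= 2 and num_edges % 2 == 0:
            yield tuple(path)
        if num_edges == max_edges:
            return
        for w in adj[path[-1]]:
            if w not in seen:
                seen.add(w)
                path.append(w)
                yield from extend(path, seen)
                path.pop()
                seen.remove(w)
    for v in adj:
        yield from extend([v], {v})

def violated_path(adj, color, max_edges):
    """Return a short path whose color sequence has the form xx, or None."""
    for path in short_paths(adj, max_edges):
        cols = [color[edge_key(a, b)] for a, b in zip(path, path[1:])]
        half = len(cols) // 2
        if cols[:half] == cols[half:]:
            return path
    return None

def color_core(adj, num_colors, max_edges, rng, max_rounds=100000):
    """Moser-Tardos style loop: resample the edges of some violated short path."""
    color = {edge_key(u, v): rng.randrange(num_colors) for u in adj for v in adj[u]}
    for _ in range(max_rounds):
        path = violated_path(adj, color, max_edges)
        if path is None:
            return color
        for a, b in zip(path, path[1:]):
            color[edge_key(a, b)] = rng.randrange(num_colors)
    raise RuntimeError("round budget exhausted")
```

The returned coloring avoids repetitions only on paths with at most `max_edges` edges; in the proof, longer
paths are handled by the probability bound in the conditional LLL-distribution, not by explicit checking.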

6 Ramsey-type Bounds 

In this section, we briefly sketch another application of our method, namely the construction of Ramsey-type 
graphs. 

The Ramsey number $R(K^s, K^t)$ is the smallest number $\ell$ such that for any $n \ge \ell$ and any red-blue coloring
of the edges of $K^n$, there exists either a $K^s$ with all red edges or a $K^t$ with all blue edges. Here, $K^a$ for
any integer $a$ denotes a clique of size $a$, as usual. The fact that these numbers are finite for all $s, t$ is a
special case of Ramsey's well-known theorem (see e.g. [43]). In one of the first applications of probabilistic
methods in combinatorics, Erdős showed the lower bound $R(K^m, K^m) = \Omega(m 2^{m/2})$ [25]. Since then, obtaining
lower bounds on $R(K^s, K^t)$ and constructing Ramsey graphs that simultaneously avoid "large" cliques and "large"
independent sets has attracted much attention [2, 9, 31, 7]. The case of fixed $s$ is the main example of
off-diagonal Ramsey numbers. Alon and Pudlák gave an explicit deterministic construction for off-diagonal
Ramsey graphs in [2]. They showed constructively that, for some $\varepsilon > 0$,
$R(K^s, K^t) \ge t^{\varepsilon \sqrt{\log s / \log\log s}}$. The best known bound for $R(K^s, K^t)$ can be obtained using
the LLL: $R(K^s, K^t) = \Omega\big((t/\log t)^{(s+1)/2}\big)$ [7]. Krivelevich gave a Monte Carlo algorithm matching this
bound through large deviation inequalities [31]. In addition,
Krivelevich considered related Ramsey type problems, for example, he showed that there exists a K^-iree 
graph on n vertices in which any o{n^^^ log^^^ n) set of vertices does not contain a K^. The problem of 
finding constructions for Ramsey type graphs matching the best known bounds is of great interest and may 
have algorithmic applications as well. 

Using our method, we can achieve the best known bound for off-diagonal Ramsey numbers, that is, for fixed
$s$, by directly making the LLL-based proof [7] constructive. More importantly, we can provide randomized
(Monte Carlo) constructions of graphs on $n$ vertices of the form: "there is no subgraph $U$ in any set of $s$
vertices and no subgraph $W$ in any set of $t$ vertices", where $t$ can be large, typically $n^{\Theta(1)}$, the
existence of which can be proved using the LLL (often using appropriate random graphs $G(n, p)$). When $U = K^s$,
$W$ is an independent set of size $t$, and $s$ is fixed, we get the off-diagonal Ramsey number. We refer to these
as general Ramsey-type graphs. This is a direct generalization of related Ramsey-type problems considered by
Krivelevich [31].

When U and W are some special subgraphs, few results are known. As mentioned earlier Krivelevich 
considered the case where U — K^, W = , s — 4, and t ~ o(n'^/^ log^^^ n). In addition, he also showed 
constructions for arbitrary [/, but when W = A'* [ • ]. Alon and Krivelevich in [7] and Krivelevich in 
[31] considered a Ramsey- type bound R'{K^ ,K'^{t)): the smallest number I such that for any n > £, any 
graph on n vertices either contains a if or there exists a set of t vertices containing a K^. When r — 2, 
R{K'^,K*) = R' {K'^ , K'^{t)). However, to the best of our knowledge, no general algorithmic result avoiding 
any subgraph $U$ and $W$ on any set of $s$ and $t$ vertices, respectively, is known to date.

Briefly, the idea is as follows. Suppose, as in the typical existence proofs for such graphs, we are able to show
using the (asymmetric) LLL that for a suitable $p = p(n)$, the random graph with $n$ vertices and independent
edge-probability $p$ satisfies all the required properties with positive probability. Theorem 3.4 will typically
immediately apply if we allow an exponential $\varepsilon$-slack. When $s$ is fixed, another related approach is to
apply Theorem 3.3 and to verify only the events that correspond to $s$-sized subgraphs; since $s$ is fixed, these
can be enumerated and verified efficiently. Note that, as pointed out in [10], the LLL may be much more
significant in improving the bounds with fixed $s$, and this is generally the case of interest when applying
LLL-based arguments.
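
The second approach, resampling only the events on $s$-sized vertex sets, can be illustrated for the special
case $U = K^s$ as follows. This is a hedged sketch, not the construction of Theorem 3.3: the function names,
the brute-force clique search and the round budget are assumptions made for illustration, and the events on
$t$-sized sets are deliberately not checked (in the analysis they are handled by a union bound in the
conditional LLL-distribution).

```python
import random
from itertools import combinations

def find_clique(edges, n, s):
    """Return some s-subset of {0, ..., n-1} that forms a clique, or None."""
    for subset in combinations(range(n), s):
        if all(edges[pair] for pair in combinations(subset, 2)):
            return subset
    return None

def sample_clique_free(n, p, s, rng, max_rounds=100000):
    """Sample G(n, p), then resample the edges of any K^s found until none remains."""
    edges = {pair: rng.random() < p for pair in combinations(range(n), 2)}
    for _ in range(max_rounds):
        clique = find_clique(edges, n, s)
        if clique is None:
            return edges
        for pair in combinations(clique, 2):
            edges[pair] = rng.random() < p   # fresh independent sample for each edge
    raise RuntimeError("round budget exhausted")
```

For fixed $s$ the search over all $s$-subsets takes time polynomial in $n$, which is the reason why only these
events are verified explicitly.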

7 Acyclic Edge Coloring 

Given a graph $G$, an acyclic edge coloring is a proper edge coloring of $G$ in which each cycle receives more
than 2 colors. In a proper edge coloring, no two incident edges receive the same color; in addition, here we
require that no cycle receives only 2 colors. The goal is to obtain an acyclic edge coloring using a minimum
number of colors (known as the acyclic chromatic number $a(G)$). The concept of $a(G)$ was introduced in
1973 [28] and has been studied by a series of researchers over the years [8, 37, 11, 28, 42]. In all these
works, the asymmetric LLL is applied to achieve the best non-constructive bounds. Thus an algorithmic
version of the local lemma is the natural first choice for obtaining an acyclic edge coloring.

Alon, McDiarmid and Reed [8] showed that $a(G) \le 64\Delta$, where $\Delta$ is the maximum degree. The constant was
later improved to 16 by Molloy and Reed [37], but the proof was still non-constructive. Both methods are
essentially the same: randomly color each edge from a pool of colors $\{1, 2, \ldots, C\}$. They define a series of
bad events, where a Type 1 bad event corresponds to two incident edges $e, f$ receiving the same color and a Type
$k$ bad event corresponds to a cycle of length $2k$ receiving only 2 colors. A cycle of odd length automatically
gets 3 colors if the coloring is proper. Note that the number of Type $k$ events for non-constant $k$ is
super-polynomial in the number of edges of $G$. The probability of a Type 1 event is $1/C$ and the probability of
a Type $k$ event is $1/C^{2(k-1)}$. Let $\Delta$ be the maximum degree of $G$. It is now an easy exercise to verify that
each Type $k$ event depends on at most $4k\Delta$ Type 1 events and $2k\Delta^{2(l-1)}$ Type $l$ events. With this
dependency, setting
C = 16A, Xe = ^ for each edge e and Xk — {^Y'^'^ for each cycle of length 2k satisfies the asymmetric 
LLL condition of Theorem 1.1. We can turn this proof into an algorithm using $16\Delta$ colors as a direct corollary
of Theorem 3.1.

Theorem 7.1. There is a randomized algorithm that produces a valid acyclic coloring of any graph with $n$
edges and maximum degree $\Delta$ in expected polynomial time using $16\Delta$ colors.

Proof. Recall from Section 1.3 that $\delta$ needs to be at least as high as the smallest upper bound on the
probability of a bad event, which is $1/C^{2(k-1)}$ for an event of Type $k$. This gives that $\log 1/\delta$ is at most
$O(n \log \Delta)$. Further, it is easy to check for a violated bad event in $O(\Delta^2 n)$ time: consider the subgraph
on every pair of colors and check whether it contains a cycle. Therefore Theorem 3.1 directly applies (we can
set $\varepsilon = 0$ in it), and we get that the expected running time is
$O(\Delta^2 n) \cdot n \cdot O(n \log \Delta) = O(n^3 \Delta^2 \log \Delta)$. Note that this is far from tight. We can, for
example, exploit that there is already an $\varepsilon$-slack in the analysis to get a smaller number of
resamplings from Theorem 3.1. □
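
The check for a violated bad event described in this proof can be sketched as follows. The function name and
the input format (a list of edges and a dictionary of edge colors) are hypothetical; the cycle test on each
two-colored subgraph uses a small union-find structure, which is one of several ways to detect a cycle.

```python
from itertools import combinations

def is_acyclic_edge_coloring(edges, color):
    """edges: list of (u, v) pairs; color: dict mapping each listed edge to its color."""
    # Type 1 events: no vertex may see the same color on two incident edges.
    seen = set()
    for (u, v) in edges:
        c = color[(u, v)]
        if (u, c) in seen or (v, c) in seen:
            return False
        seen.add((u, c))
        seen.add((v, c))
    # Type k events: for every pair of colors, the two-colored subgraph must be a forest.
    for c1, c2 in combinations(sorted(set(color.values())), 2):
        parent = {}

        def find(x):
            parent.setdefault(x, x)
            while parent[x] != x:
                parent[x] = parent[parent[x]]   # path halving
                x = parent[x]
            return x

        for (u, v) in edges:
            if color[(u, v)] in (c1, c2):
                root_u, root_v = find(u), find(v)
                if root_u == root_v:
                    return False                # this edge closes a bichromatic cycle
                parent[root_u] = root_v
    return True
```

With $C = O(\Delta)$ colors there are $O(\Delta^2)$ color pairs and each pass over the $n$ edges takes near-linear
time, in line with the $O(\Delta^2 n)$ bound used in the proof.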

While this gives an efficient way to obtain an acyclic edge coloring using $16\Delta$ colors, thus matching the
bound known non-constructively so far, the conjectured bound for $a(G)$ is $\Delta + 2$. Alon, Sudakov and Zaks
showed that the conjecture is indeed true for graphs having girth $\Omega(\Delta \log \Delta)$ [11]. Their proof can be made
constructive using Beck's technique [17] to obtain an acyclic edge coloring using $\Delta + 2$ colors, albeit for
graphs with girth significantly larger than $\Theta(\Delta \log \Delta)$. We bridge this gap by providing constructions that
achieve the same girth bound as in [11], while still obtaining an acyclic edge coloring with only $\Delta + 2$ colors.

The proof of Alon, Sudakov and Zaks again relies on the asymmetric LLL, but their procedure for random
coloring is different from [8, 37]. They first perform a proper coloring of the edges of $G$ using $\Delta + 1$ colors
[48]. Next, each edge is switched to color $\Delta + 2$ with probability $1/(32\Delta)$. Three types of bad events are
defined. In Type 1 events, two incident edges $e, f$ are both colored with the $(\Delta + 2)$th color. Type 2 events
correspond to the case where no edge of a previously bichromatic cycle switches its color to the $(\Delta + 2)$th
color. Type 3 events correspond to the case where a cycle with half of its edges (every other edge) having the
same color after the first step receives the $(\Delta + 2)$th color on the remaining half of its edges, resulting
in a bichromatic cycle.
It is sufficient to avoid these three types of events to ensure an acyclic edge coloring. It has been shown 
in [11] that by setting the values of x variables to 1/512A^, 1/128A2 and 1/2^/^^ for events of Type 1, 2, 
and 3 (with cycles of length $2k$), the conditions of the asymmetric LLL (Theorem 1.1) are satisfied. Converting
this non-constructive proof into an algorithm using our method is straightforward. We state the theorem below;
its proof is similar to that of Theorem 7.1 above and is left as an exercise.

Theorem 7.2. There is a randomized algorithm that produces a valid acyclic coloring in expected polynomial
time using $\Delta + 2$ colors for graphs having girth $\Omega(\Delta \log \Delta)$.

Further non-asymptotic results are known for graphs with sufficiently large girth. Muthu, Narayanan and
Subramanian showed $a(G) \le 6\Delta$ for graphs with girth at least 9 and $a(G) \le 4.52\Delta$ for graphs with girth at
least 220 [42]. Their proofs can also be made constructive using essentially the same proof as that of
Theorem 7.1.

8 Beyond the LLL Threshold 

This section sketches another application of the properties of the conditional LLL-distribution introduced
in Section 2, used in a slightly different way. While all results presented so far rely on a union bound over
events in the LLL-distribution, here we use the linearity of expectation for further probabilistic analysis of
events in the LLL-distribution. This already leads to new non-constructive results. As with the other
proofs involving the LLL-distribution in this paper, this upper bound can be made constructive using
Theorem 2.2. Considering that the LLL-distribution approximately preserves other quantities such as higher
moments, we expect that there is much more room to use more sophisticated probabilistic tools, like
concentration bounds, to give both new non-constructive and constructive existence proofs of discrete
structures with additional strong properties.

The setting we wish to concentrate on here is when a set of bad events is given of which not necessarily
all, but as many as possible, are to be avoided. The exemplifying application is the well-known
MAX-$k$-SAT problem which, in contrast to $k$-SAT, asks not for a satisfying assignment of a $k$-CNF formula
but for an assignment that violates as few clauses as possible. Given a $k$-CNF formula with $m$ clauses, a
random assignment to its variables violates each clause with probability $2^{-k}$, and thus, using linearity of
expectation, it is easy to find an assignment that violates at most $m 2^{-k}$ clauses. If, on the other hand,
each clause shares variables with at most $2^k/e - 1$ other clauses, then the LLL can be used to prove the
existence of a satisfying assignment (which violates 0 clauses), and the MT algorithm can be used to find such
an assignment efficiently. But what happens when the number of clauses sharing a variable is more than $2^k/e$?
Lemma 8.1 shows that a better assignment can be constructed if it is possible to find a sparsely connected
sub-formula that satisfies the LLL-condition.

Lemma 8.1. Suppose $F$ is a $k$-CNF formula in which there exists a set of core clauses $C$ with the property
that: (i) every clause in $C$ shares variables with at most $d \le 2^k/e - 1$ clauses in $C$, and (ii) every clause
in $\overline{C} = F \setminus C$ shares variables with at most $\gamma(2^k/e - 1)$ clauses in $C$, for some $\gamma > 0$. Let $n$
and $m$ denote the total number of variables and clauses in $F$, respectively. Then, for any
$\theta > 1/\mathrm{poly}(n, m)$, there is a randomized $\mathrm{poly}(n, m)$-time algorithm that produces, with high
probability, an assignment in which all clauses in $C$ are satisfied and at most a $(1 + \theta) 2^{-k} e^{\gamma}$
fraction of the clauses from $\overline{C}$ are violated. (If we are content with success probability
$\rho - n^{-c}$ for some constant $c$, then there is also a randomized algorithm that runs in time
$\mathrm{poly}(n, |C|)$, satisfies all clauses in $C$, and violates at most a $(1/\rho) \cdot 2^{-k} e^{\gamma}$ fraction of the
clauses from $\overline{C}$. This can be useful if $|C| \ll m$.)

Proof. Briefly, the idea is as follows. Suppose we do the obvious random assignment to the variables:
each is set to "True" or "False" uniformly at random and independently. For any clause $C_i$, let $A_i$ be the
bad event that it is violated in such an assignment. It is well known that we can take $x(A_i) = e/2^k$ for all
$C_i \in C$ and avoid all of these events with positive probability; this can be made constructive using MT.
Suppose we run the MT algorithm for up to $n^c$ times its expected number of resamplings. By Markov's
inequality, the probability of MT not terminating by then is at most $n^{-c}$. Furthermore, the probability that,
at the end of this process, a given non-core clause $C_i \in \overline{C}$ is violated can be bounded (using part
(ii) of Theorem 2.2) by the following:

$$2^{-k} \cdot (1 - e/2^k)^{-\gamma(2^k/e - 1)} \le 2^{-k} e^{\gamma}.$$

Thus, the expected fraction of clauses from $\overline{C}$ that are violated in the end is at most
$2^{-k} e^{\gamma}$. Markov's inequality and a union bound (for a sufficiently large choice of $c$) complete the
proof. □
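
The procedure analyzed in this proof (run MT on the core clauses only, then count the violated non-core
clauses) can be sketched in Python. The sketch makes several assumptions for illustration: clauses are lists of
nonzero integers in the usual signed-literal convention, the names `violated`, `satisfy_core` and
`violated_fraction` are hypothetical, and a fixed round budget stands in for "$n^c$ times the expected number
of resamplings".

```python
import random

def violated(clause, assignment):
    """A literal v > 0 means x_v, v < 0 means not x_|v|; a clause is violated if all literals are false."""
    return all(assignment[abs(lit)] != (lit > 0) for lit in clause)

def satisfy_core(core, num_vars, rng, max_rounds=100000):
    """Moser-Tardos resampling restricted to the core clauses."""
    assignment = {v: rng.random() < 0.5 for v in range(1, num_vars + 1)}
    for _ in range(max_rounds):
        bad = next((c for c in core if violated(c, assignment)), None)
        if bad is None:
            return assignment
        for lit in bad:                      # resample the variables of one violated clause
            assignment[abs(lit)] = rng.random() < 0.5
    raise RuntimeError("resampling budget exhausted")

def violated_fraction(clauses, assignment):
    """Fraction of the given (non-core) clauses that the assignment violates."""
    return sum(violated(c, assignment) for c in clauses) / len(clauses)
```

A typical use is `a = satisfy_core(core, n, random.Random(0))` followed by `violated_fraction(non_core, a)`;
the lemma bounds the expectation of the latter quantity by $2^{-k} e^{\gamma}$ under its assumptions.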

Along these lines, we aim to develop a general result that can be applied in cases where the number of
dependencies is (slightly) beyond the LLL-threshold. For this, suppose we have a system of independent
random variables $\mathcal{P} = \{P_1, P_2, \ldots, P_n\}$ and bad events $\mathcal{A} = \{A_1, A_2, \ldots, A_m\}$ with dependency
graph $G = G_{\mathcal{A}}$, as in the introduction. Let us consider the symmetric case in which $\Pr[A_i] \le p$ for each
$i$. Again, there are only two types of constructive results known in general, in terms of only allowing a
"small" number of the bad events to happen. It is easy to have only about $mp$ of the $A_i$ happen, without any
assumptions about $G$, just by using the linearity of expectation. On the other hand, if the maximum degree of
$G$ is at most $1/(ep) - 1$, the conditions of the symmetric LLL and the algorithm of [40] guarantee that we
can efficiently ensure that none of the $A_i$ happen. Interpolating between these two extremes, Theorem 8.3
characterizes the fraction $\lambda$ of events that can be avoided if the maximum degree of $G$ is larger than the
LLL-threshold $1/(ep) - 1$ by a factor of $\alpha > 1$. To the best of our knowledge, virtually nothing was known
(even non-constructively) in this setting. Theorem 8.3 is obtained by using the probabilistic method to
construct a sparsely connected core that satisfies the LLL-conditions with a sufficiently large gap. Using the
linearity of expectation in the analysis of the LLL-distribution with respect to this core, the existence of a
good assignment can be proven:

Definition 8.2. For any $\alpha > 0$, let $\lambda(\alpha)$ be the smallest number satisfying the following:
For any setting of the standard form "variables $\mathcal{P}$ and bad events $\mathcal{A}$" in which
(a) the probability of any event $A \in \mathcal{A}$ is at most $p = o(1)$ and
(b) the maximum degree in the variable-sharing (dependency) graph is at most $d = \alpha(1/(ep) - 1)$,
there exists an assignment to the variables in $\mathcal{P}$ that violates at most $(1 + o(1))\lambda(\alpha) \cdot mp$ events.

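Theorem 8.3. The fraction $\lambda(\alpha)$ is upper bounded as follows:

• $\forall \alpha \le 1$: $\lambda(\alpha) = 0$;

• $\forall\, 1 < \alpha < e$: $\lambda(\alpha) \le e \ln(\alpha)/\alpha < 1$;

• $\forall \alpha \ge e$: $\lambda(\alpha) = 1$.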

Proof. If $\alpha \le 1$, then the standard symmetric LLL and the MT algorithm ensure that no bad event holds.
On the other hand, if $\alpha \ge e$, then consider $m = 1/p$ events given by a single random variable $X$ which is
uniformly distributed in $\{1, 2, \ldots, 1/p\}$; the $i$th event holds iff $X = i$. We have $d = 1/p - 1$ here, and
exactly one bad event holds with probability one. Thus, we cannot do better than the obvious bound of $mp$ if
$\alpha \ge e$.

For our main case, where the constant $\alpha$ satisfies $1 < \alpha < e$, we employ the probabilistic method to first
determine a core subset of the bad events, and then apply Theorem 2.2. We give a proof sketch here. Since
$p = o(1)$, $d \sim \alpha/(ep)$. We will pick a suitable $\varepsilon = o(1)$ and an appropriate constant $\beta > 1$; $d$ can
be assumed sufficiently larger than $\beta$ since $d = \omega(1)$. Choose a random subset $\mathcal{A}'$ of the events $A_i$ by
choosing each event independently with probability $(1 - \varepsilon)/(\alpha\beta)$, and then eliminate all events from
$\mathcal{A}$ that have more than $d/(\alpha\beta)$ neighbors in $\mathcal{A}'$. The Chernoff bound shows that with probability
$1 - \exp(-\Omega(d\varepsilon^2/\beta))$ (which is $1 - o(1)$ for a suitable choice of $\varepsilon$ and $\beta$), the core $\mathcal{A}'$ has
at least a $(1 - o(1))/(\alpha\beta)$ fraction of the events, and at most an $\exp(-d\varepsilon^2/\beta) = o(p)$ fraction of
the events get eliminated from $\mathcal{A}$. Therefore, there is a core $\mathcal{A}'$ of size at least
$(1 - o(1))m/(\alpha\beta)$ such that all events that are not eliminated (a $(1 - o(p))$ fraction of the events) have
at most $d/(\alpha\beta)$ neighbors in it. If we take $x_A = \gamma\alpha/d$ for all $A \in \mathcal{A}'$ for a suitable
$\gamma \in [0, 1]$, then the core $\mathcal{A}'$ satisfies the LLL-conditions. This is the case if $\gamma$ satisfies

$$\alpha/(ed) \le (\gamma\alpha/d) \cdot (1 - \gamma\alpha/d)^{d/(\alpha\beta)};$$

i.e.,

$$1/e \le \gamma e^{-\gamma/\beta} \qquad (9)$$

suffices for $d$ large enough.

Now we can apply Theorem 2.1 to obtain bounds on the probability of events in the conditional
LLL-distribution $D$ that avoids all events in $\mathcal{A}'$. It implies that a random assignment that avoids all
core events in $\mathcal{A}'$ makes an event $A \in (\mathcal{A} \setminus \mathcal{A}')$ that is not eliminated true with probability

$$\Pr_D[A] \le \Pr[A] / (1 - \gamma\alpha/d)^{d/(\alpha\beta)} \sim e^{\gamma/\beta} \cdot p. \qquad (10)$$

By the linearity of expectation applied to all "non-core" events $A \in (\mathcal{A} \setminus \mathcal{A}')$ and using (10), the
expected total number of events $A_i$ that happen in such an assignment is at most

$$(1 + o(1)) \cdot mp \cdot (1 - 1/(\alpha\beta)) \cdot e^{\gamma/\beta}, \qquad (11)$$

assuming that (9) holds and that $\varepsilon = o(1)$ can be chosen suitably. From (9), we can take
$1/\beta = (1 - o(1)) \cdot (1 + \ln(\gamma))/\gamma$. Plugging this into (11), we see that the optimal choice of $\gamma$ is
$1/\alpha$. (Any choice of $\varepsilon = o(1)$ that satisfies $\exp(-d\varepsilon^2/\beta) = o(p)$ will suffice for this
argument.) Substituting these choices into (11) yields the theorem. □
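
The final substitution can be spelled out as follows. This is a worked version of the last step in which all
$o(1)$ factors are dropped for readability; the function name $g$ is introduced here only for illustration.

```latex
\begin{align*}
\frac{1}{\beta} = \frac{1 + \ln\gamma}{\gamma}
  \quad &\Longrightarrow \quad e^{\gamma/\beta} = e^{1 + \ln\gamma} = e\gamma, \\
g(\gamma) := \Big(1 - \frac{1}{\alpha\beta}\Big) e^{\gamma/\beta}
  &= \Big(1 - \frac{1 + \ln\gamma}{\alpha\gamma}\Big) e\gamma
   = e\gamma - \frac{e\,(1 + \ln\gamma)}{\alpha}, \\
g'(\gamma) = e - \frac{e}{\alpha\gamma} = 0
  \quad &\Longleftrightarrow \quad \gamma = \frac{1}{\alpha}, \\
g\Big(\frac{1}{\alpha}\Big)
  &= \frac{e}{\alpha} - \frac{e\,(1 - \ln\alpha)}{\alpha}
   = \frac{e \ln\alpha}{\alpha}.
\end{align*}
```

Since $g''(\gamma) = e/(\alpha\gamma^2) > 0$, the stationary point is a minimum, and (11) becomes
$(1 + o(1)) \cdot mp \cdot e\ln(\alpha)/\alpha$, matching the middle case of Theorem 8.3. The choice is admissible for
$1 < \alpha < e$: then $\gamma = 1/\alpha \in [0, 1]$ and $1/\beta = \alpha(1 - \ln\alpha) \in (0, 1)$, so $\beta > 1$.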

Theorem 8.4. Theorem 8.3 can be made constructive for any $\alpha > 0$ and any efficiently verifiable $\mathcal{A}$ (the
verification in this case is allowed to take $\mathrm{poly}(n, m)$ time) that satisfies the conditions from
Definition 8.2. That is, there is a $\mathrm{poly}(n, m)$-time randomized algorithm to set values for the variables in
$\mathcal{P}$ such that the expected number of events $A_i$ that hold is at most $(1 + o(1))\lambda(\alpha) \cdot mp$.

Proof. If $\alpha \le 1$ then the theorem follows directly from [40], and for $\alpha \ge e$ a random assignment suffices.
For $1 < \alpha < e$ we make the proof of Theorem 8.3 constructive. For this it suffices to see that the success
probability of the random experiment that creates the core can be made arbitrarily high by choosing $\varepsilon$ and
$\beta$ accordingly. This makes the probabilistic method used there directly constructive. Finally, we again use
our main Theorem 2.2 from Section 2, which states that the MT algorithm can be used to efficiently sample
the LLL-distribution used in the proof of Theorem 8.3. We simply output the assignment produced
by the MT algorithm in time $\mathrm{poly}(n, m)$, and the proof of Theorem 8.3 guarantees that the expected number
of violated bad events in this assignment is at most $(1 + o(1))\lambda(\alpha) \cdot mp$, as desired. □

Remark. Interestingly, we can use our LLL framework instead to construct the core in the proof of
Theorem 8.3. This gives a larger core than what is obtained by uniform random core selection and thus slightly
sharper results. Briefly, the idea is as follows. For parameters $\alpha$, $\beta$, and $\gamma$ that are similar to those in
that proof, we start with essentially the same random process for constructing the core: for a sufficiently
large constant $c_1$ (even something slightly smaller than 3 will suffice), we set $\varepsilon = c_1 \sqrt{(\ln d)/d}$, and
place each event $A_i$ in the core independently with probability $\theta = (1 - \varepsilon)/(\alpha\beta)$. Letting $X_i$ denote
the (random) number of neighbors that $A_i$ has in the core, note that the expected value of $X_i$ is at most
$d\theta = (1 - \varepsilon)d/(\alpha\beta)$. Now consider the system of bad events $C_i$ (one corresponding to each original
event $A_i$): $C_i$ is the event that $X_i > d/(\alpha\beta)$. Note that each $C_i$ depends on at most $d^2$ others; a
Chernoff bound shows that for $c_1$ large enough, $\Pr[C_i] \le 1/(ed^3)$. Thus, the LLL shows that there exists a
choice of the core that avoids all the $C_i$, with a common value $x = x(C_i)$ for all $i$ such that
$x = \Theta(1/d^3)$. We run the Moser-Tardos algorithm on the system of bad events $C_i$ to efficiently get a core
that avoids all of the $C_i$, and let $N_i$ be the indicator random variable for $i$ not belonging to the core at
the end of this run. We now apply Theorem 2.1 to upper-bound the expected size $\sum_i \Pr[N_i = 1]$ of the
non-core. Since $N_i$ depends on at most $1 + d + d(d - 1) = d^2 + 1$ events $C_j$, Theorem 2.1 gives

$$\Pr[N_i = 1] \le \frac{1 - \theta}{(1 - x)^{d^2 + 1}} = (1 + O(1/d)) \cdot (1 - \theta).$$

Thus the expected size of the non-core is at most $(1 + O(1/d)) \cdot m \cdot (1 - \theta)$ after the above run of
Moser-Tardos, similar to what the alteration argument in the proof of Theorem 8.3 gives. We can now proceed
(with one more run of Moser-Tardos) as in the proof of Theorem 8.3.